Commit | Line | Data |
---|---|---|
a0d0e21e LW |
1 | =head1 NAME |
2 | ||
3 | perlsec - Perl security | |
4 | ||
5 | =head1 DESCRIPTION | |
6 | ||
425e5e39 | 7 | Perl is designed to make it easy to program securely even when running |
8 | with extra privileges, like setuid or setgid programs. Unlike most | |
54310121 | 9 | command line shells, which are based on multiple substitution passes on |
425e5e39 | 10 | each line of the script, Perl uses a more conventional evaluation scheme |
11 | with fewer hidden snags. Additionally, because the language has more | |
54310121 | 12 | builtin functionality, it can rely less upon external (and possibly |
425e5e39 | 13 | untrustworthy) programs to accomplish its purposes. |
a0d0e21e | 14 | |
89f530a6 DG |
15 | =head1 SECURITY VULNERABILITY CONTACT INFORMATION |
16 | ||
87c118b9 DM |
17 | If you believe you have found a security vulnerability in Perl, please |
18 | email the details to perl5-security-report@perl.org. This creates a new | |
19 | Request Tracker ticket in a special queue which isn't initially publicly | |
20 | accessible. The email will also be copied to a closed subscription | |
21 | unarchived mailing list which includes all the core committers, who will | |
22 | be able to help assess the impact of issues, figure out a resolution, and | |
23 | help co-ordinate the release of patches to mitigate or fix the problem | |
24 | across all platforms on which Perl is supported. Please only use this | |
25 | address for security issues in the Perl core, not for modules | |
26 | independently distributed on CPAN. | |
27 | ||
28 | When sending an initial request to the security email address, please | |
29 | don't Cc any other parties, because if they reply to all, the reply will | |
30 | generate yet another new ticket. Once you have received an initial reply | |
31 | with a C<[perl #NNNNNN]> ticket number in the subject line, it's okay to Cc | |
32 | subsequent replies to third parties: all emails to the | |
33 | perl5-security-report address with the ticket number in the subject line | |
34 | will be added to the ticket; without it, a new ticket will be created. | |
89f530a6 DG |
35 | |
36 | =head1 SECURITY MECHANISMS AND CONCERNS | |
37 | ||
38 | =head2 Taint mode | |
39 | ||
425e5e39 | 40 | Perl automatically enables a set of special security checks, called I<taint |
41 | mode>, when it detects its program running with differing real and effective | |
42 | user or group IDs. The setuid bit in Unix permissions is mode 04000, the | |
43 | setgid bit mode 02000; either or both may be set. You can also enable taint | |
91e64913 | 44 | mode explicitly by using the B<-T> command line flag. This flag is |
425e5e39 | 45 | I<strongly> suggested for server programs and any program run on behalf of |
91e64913 | 46 | someone else, such as a CGI script. Once taint mode is on, it's on for |
fb73857a | 47 | the remainder of your script. |
a0d0e21e | 48 | |
1e422769 | 49 | While in this mode, Perl takes special precautions called I<taint |
50 | checks> to prevent both obvious and subtle traps. Some of these checks | |
51 | are reasonably simple, such as verifying that path directories aren't | |
52 | writable by others; careful programmers have always used checks like | |
53 | these. Other checks, however, are best supported by the language itself, | |
fb73857a | 54 | and it is these checks especially that contribute to making a set-id Perl |
425e5e39 | 55 | program more secure than the corresponding C program. |
56 | ||
fb73857a | 57 | You may not use data derived from outside your program to affect |
58 | something else outside your program--at least, not by accident. All | |
59 | command line arguments, environment variables, locale information (see | |
23634c10 AL |
60 | L<perllocale>), results of certain system calls (C<readdir()>, |
61 | C<readlink()>, the variable of C<shmread()>, the messages returned by | |
62 | C<msgrcv()>, the password, gcos and shell fields returned by the | |
63 | C<getpwxxx()> calls), and all file input are marked as "tainted". | |
41d6edb2 JH |
64 | Tainted data may not be used directly or indirectly in any command |
65 | that invokes a sub-shell, nor in any command that modifies files, | |
b7ee89ce AP |
66 | directories, or processes, B<with the following exceptions>: |
67 | ||
68 | =over 4 | |
69 | ||
70 | =item * | |
71 | ||
b7ee89ce AP |
72 | Arguments to C<print> and C<syswrite> are B<not> checked for taintedness. |
73 | ||
7f6513c1 JH |
74 | =item * |
75 | ||
76 | Symbolic methods | |
77 | ||
78 | $obj->$method(@args); | |
79 | ||
80 | and symbolic sub references | |
81 | ||
82 | &{$foo}(@args); | |
83 | $foo->(@args); | |
84 | ||
85 | are not checked for taintedness. This requires extra carefulness | |
86 | unless you want external data to affect your control flow. Unless | |
87 | you carefully limit what these symbolic values are, people are able | |
88 | to call functions B<outside> your Perl code, such as POSIX::system, | |
89 | in which case they are able to run arbitrary external code. | |
90 | ||
8ea1447c RD |
91 | =item * |
92 | ||
93 | Hash keys are B<never> tainted. | |
94 | ||
b7ee89ce AP |
95 | =back |
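One way to limit what a symbolic method name can reach is to check it against an explicit allow-list before the call. The following is only a sketch: the C<Report> class, the C<%ALLOWED_METHOD> table, and C<call_checked()> are hypothetical names invented for illustration, not part of Perl.

```perl
use strict;
use warnings;

package Report;    # hypothetical class, for illustration only
sub new     { return bless {}, shift }
sub summary { return "summary" }
sub detail  { return "detail" }

package main;

# Only these method names may be chosen by outside data (assumed list).
my %ALLOWED_METHOD = map { $_ => 1 } qw(summary detail);

sub call_checked {
    my ($obj, $method, @args) = @_;
    defined $method
        or die "No method name given\n";
    $ALLOWED_METHOD{$method}
        or die "Method '$method' is not permitted\n";
    # Taint mode does not check this call, so the lookup above is
    # the only guard against names such as POSIX::system.
    return $obj->$method(@args);
}

my $report = Report->new;
print call_checked($report, 'summary'), "\n";
```

Any name missing from the table, including a fully qualified one such as C<POSIX::system>, makes C<call_checked()> die instead of dispatching.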
96 | ||
595bde10 MG |
97 | For efficiency reasons, Perl takes a conservative view of |
98 | whether data is tainted. If an expression contains tainted data, | |
99 | any subexpression may be considered tainted, even if the value | |
100 | of the subexpression is not itself affected by the tainted data. | |
ee556d55 | 101 | |
d929ce6f | 102 | Because taintedness is associated with each scalar value, some |
595bde10 | 103 | elements of an array or hash can be tainted and others not. |
8ea1447c | 104 | The keys of a hash are B<never> tainted. |
a0d0e21e | 105 | |
a0d0e21e LW |
106 | For example: |
107 | ||
425e5e39 | 108 | $arg = shift; # $arg is tainted |
048b63be | 109 | $hid = $arg . 'bar'; # $hid is also tainted |
425e5e39 | 110 | $line = <>; # Tainted |
8ebc5c01 | 111 | $line = <STDIN>; # Also tainted |
112 | open FOO, "/home/me/bar" or die $!; | |
113 | $line = <FOO>; # Still tainted | |
a0d0e21e | 114 | $path = $ENV{'PATH'}; # Tainted, but see below |
425e5e39 | 115 | $data = 'abc'; # Not tainted |
a0d0e21e | 116 | |
425e5e39 | 117 | system "echo $arg"; # Insecure |
7de90c4d | 118 | system "/bin/echo", $arg; # Considered insecure |
bbd7eb8a | 119 | # (Perl doesn't know about /bin/echo) |
425e5e39 | 120 | system "echo $hid"; # Insecure |
121 | system "echo $data"; # Insecure until PATH set | |
a0d0e21e | 122 | |
425e5e39 | 123 | $path = $ENV{'PATH'}; # $path now tainted |
a0d0e21e | 124 | |
54310121 | 125 | $ENV{'PATH'} = '/bin:/usr/bin'; |
c90c0ff4 | 126 | delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'}; |
a0d0e21e | 127 | |
425e5e39 | 128 | $path = $ENV{'PATH'}; # $path now NOT tainted |
129 | system "echo $data"; # Is secure now! | |
a0d0e21e | 130 | |
425e5e39 | 131 | open(FOO, "< $arg"); # OK - read-only file |
132 | open(FOO, "> $arg"); # Not OK - trying to write | |
a0d0e21e | 133 | |
bbd7eb8a | 134 | open(FOO,"echo $arg|"); # Not OK |
425e5e39 | 135 | open(FOO,"-|") |
7de90c4d | 136 | or exec 'echo', $arg; # Also not OK |
a0d0e21e | 137 | |
425e5e39 | 138 | $shout = `echo $arg`; # Insecure, $shout now tainted |
a0d0e21e | 139 | |
425e5e39 | 140 | unlink $data, $arg; # Insecure |
141 | umask $arg; # Insecure | |
a0d0e21e | 142 | |
bbd7eb8a | 143 | exec "echo $arg"; # Insecure |
7de90c4d RD |
144 | exec "echo", $arg; # Insecure |
145 | exec "sh", '-c', $arg; # Very insecure! | |
a0d0e21e | 146 | |
3a4b19e4 GS |
147 | @files = <*.c>; # insecure (uses readdir() or similar) |
148 | @files = glob('*.c'); # insecure (uses readdir() or similar) | |
7bac28a0 | 149 | |
dde0c558 BF |
150 | # In either case, the results of glob are tainted, since the list of |
151 | # filenames comes from outside of the program. | |
3f7d42d8 | 152 | |
ee556d55 MG |
153 | $bad = ($arg, 23); # $bad will be tainted |
154 | $arg, `true`; # Insecure (although it isn't really) | |
155 | ||
a0d0e21e | 156 | If you try to do something insecure, you will get a fatal error saying |
7de90c4d | 157 | something like "Insecure dependency" or "Insecure $ENV{PATH}". |
425e5e39 | 158 | |
23634c10 AL |
159 | The exception to the principle of "one tainted value taints the whole |
160 | expression" is with the ternary conditional operator C<?:>. Since code | |
161 | with a ternary conditional | |
162 | ||
163 | $result = $tainted_value ? "Untainted" : "Also untainted"; | |
164 | ||
165 | is effectively | |
166 | ||
167 | if ( $tainted_value ) { | |
168 | $result = "Untainted"; | |
169 | } else { | |
170 | $result = "Also untainted"; | |
171 | } | |
172 | ||
173 | it doesn't make sense for C<$result> to be tainted. | |
174 | ||
425e5e39 | 175 | =head2 Laundering and Detecting Tainted Data |
176 | ||
3f7d42d8 JH |
177 | To test whether a variable contains tainted data, whose use would |
178 | thus trigger an "Insecure dependency" message, you can use the | |
23634c10 | 179 | C<tainted()> function of the Scalar::Util module, available in your |
3f7d42d8 | 180 | nearby CPAN mirror, and included in Perl starting from the release 5.8.0. |
595bde10 | 181 | Or you may be able to use the following C<is_tainted()> function. |
425e5e39 | 182 | |
183 | sub is_tainted { | |
7687d286 | 184 | local $@; # Don't pollute caller's value. |
61890e45 | 185 | return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 }; |
425e5e39 | 186 | } |
187 | ||
188 | This function makes use of the fact that the presence of tainted data | |
189 | anywhere within an expression renders the entire expression tainted. It | |
190 | would be inefficient for every operator to test every argument for | |
191 | taintedness. Instead, the slightly more efficient and conservative | |
192 | approach is used that if any tainted value has been accessed within the | |
193 | same expression, the whole expression is considered tainted. | |
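As a small illustration of the C<Scalar::Util> route, the sketch below reports the taint status of a value. The helper name C<taint_status()> is an assumption made for this example. Without B<-T>, C<tainted()> reports every value as not tainted, so the script is only informative when run under taint mode.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: describe whether a single value carries taint.
sub taint_status {
    my ($label, $value) = @_;
    return sprintf "%s is %s", $label,
        tainted($value) ? "tainted" : "not tainted";
}

print taint_status('literal', 'abc'), "\n";

# Command line arguments are tainted under -T; an empty string
# stands in when no argument was given.
my $first = defined $ARGV[0] ? $ARGV[0] : '';
print taint_status('first argument', $first), "\n";
```

Running this as C<perl -T script.pl something> and again without B<-T> shows the difference for the command line argument.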
194 | ||
5f05dabc | 195 | But testing for taintedness gets you only so far. Sometimes you just have |
595bde10 MG |
196 | to clear your data's taintedness. Values may be untainted by using them |
197 | as keys in a hash; otherwise the only way to bypass the tainting | |
54310121 | 198 | mechanism is by referencing subpatterns from a regular expression match. |
18512f39 KW |
199 | Perl presumes that if you reference a substring using $1, $2, etc. in a |
200 | non-tainting pattern, that | |
201 | you knew what you were doing when you wrote that pattern. That means using | |
425e5e39 | 202 | a bit of thought--don't just blindly untaint anything, or you defeat the |
a034a98d DD |
203 | entire mechanism. It's better to verify that the variable has only good |
204 | characters (for certain values of "good") rather than checking whether it | |
205 | has any bad characters. That's because it's far too easy to miss bad | |
206 | characters that you never thought of. | |
425e5e39 | 207 | |
208 | Here's a test to make sure that the data contains nothing but "word" | |
209 | characters (alphabetics, numerics, and underscores), a hyphen, an at sign, | |
210 | or a dot. | |
211 | ||
54310121 | 212 | if ($data =~ /^([-\@\w.]+)$/) { |
425e5e39 | 213 | $data = $1; # $data now untainted |
214 | } else { | |
3a2263fe | 215 | die "Bad data in '$data'"; # log this somewhere |
425e5e39 | 216 | } |
217 | ||
5f05dabc | 218 | This is fairly secure because C</\w+/> doesn't normally match shell |
425e5e39 | 219 | metacharacters, nor are dot, dash, or at going to mean something special |
220 | to the shell. Use of C</.+/> would have been insecure in theory because | |
221 | it lets everything through, but Perl doesn't check for that. The lesson | |
222 | is that when untainting, you must be exceedingly careful with your patterns. | |
19799a22 | 223 | Laundering data using a regular expression is the I<only> mechanism for |
425e5e39 | 224 | untainting dirty data, unless you use the strategy detailed below to fork |
225 | a child of lesser privilege. | |
226 | ||
23634c10 | 227 | The example does not untaint C<$data> if C<use locale> is in effect, |
a034a98d DD |
228 | because the characters matched by C<\w> are determined by the locale. |
229 | Perl considers that locale definitions are untrustworthy because they | |
230 | contain data from outside the program. If you are writing a | |
231 | locale-aware program, and want to launder data with a regular expression | |
232 | containing C<\w>, put C<no locale> ahead of the expression in the same | |
233 | block. See L<perllocale/SECURITY> for further discussion and examples. | |
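The allow-list test above can be wrapped in a small helper so that every caller launders data the same way. This is a minimal sketch, and C<launder_token()> is a hypothetical name. It uses C<\A> and C<\z> rather than C<^> and C<$>, which is slightly stricter than the example above because C<$> would also accept a trailing newline.

```perl
use strict;
use warnings;

# Hypothetical helper: return a laundered copy of $value if it consists
# solely of word characters, hyphens, at signs, and dots; otherwise
# return undef.
sub launder_token {
    my ($value) = @_;
    no locale;    # keep \w independent of locale data
    return undef unless defined $value;
    return $value =~ /\A([-\@\w.]+)\z/ ? $1 : undef;
}

for my $candidate ('jane.doe@example.com', 'foo; rm -rf /') {
    my $clean = launder_token($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected input\n";
}
```

The returned value comes from C<$1>, so under taint mode it is the untainted copy; the original variable stays tainted. On Perl 5.14 and later, adding the C</a> modifier restricts C<\w> to ASCII if that is what the application expects.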
234 | ||
3a52c276 CS |
235 | =head2 Switches On the "#!" Line |
236 | ||
237 | When you make a script executable, in order to make it usable as a | |
238 | command, the system will pass switches to perl from the script's #! | |
54310121 | 239 | line. Perl checks that any command line switches given to a setuid |
3a52c276 | 240 | (or setgid) script actually match the ones set on the #! line. Some |
54310121 | 241 | Unix and Unix-like environments impose a one-switch limit on the #! |
3a52c276 | 242 | line, so you may need to use something like C<-wU> instead of C<-w -U> |
54310121 | 243 | under such systems. (This issue should arise only in Unix or |
244 | Unix-like environments that support #! and setuid or setgid scripts.) | |
3a52c276 | 245 | |
588f7210 SB |
246 | =head2 Taint mode and @INC |
247 | ||
f7335192 DC |
248 | When the taint mode (C<-T>) is in effect, the environment variables |
249 | C<PERL5LIB> and C<PERLLIB> | |
91e64913 | 250 | are ignored by Perl. You can still adjust C<@INC> from outside the |
588f7210 | 251 | program by using the C<-I> command line option as explained in |
028611fa DB |
252 | L<perlrun|perlrun/-Idirectory>. The two environment variables are |
253 | ignored because they are obscured, and a user running a program could | |
254 | be unaware that they are set, whereas the C<-I> option is clearly | |
255 | visible and therefore permitted. | |
588f7210 SB |
256 | |
257 | Another way to modify C<@INC> without modifying the program, is to use | |
258 | the C<lib> pragma, e.g.: | |
259 | ||
260 | perl -Mlib=/foo program | |
261 | ||
262 | The benefit of using C<-Mlib=/foo> over C<-I/foo>, is that the former | |
6fd9f613 | 263 | will automagically remove any duplicated directories, while the latter |
588f7210 SB |
264 | will not. |
265 | ||
6a268663 RGS |
266 | Note that if a tainted string is added to C<@INC>, the following |
267 | problem will be reported: | |
268 | ||
269 | Insecure dependency in require while running with -T switch | |
270 | ||
f7335192 | 271 | On versions of Perl before 5.26, activating taint mode will also remove |
a1c1fa25 DC |
272 | the current directory (".") from the default value of C<@INC>. Since |
273 | version 5.26, the current directory isn't included in C<@INC> by | |
274 | default. | |
f7335192 | 275 | |
425e5e39 | 276 | =head2 Cleaning Up Your Path |
277 | ||
df98f984 RGS |
278 | For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to |
279 | a known value, and each directory in the path must be absolute and | |
280 | non-writable by others than its owner and group. You may be surprised to | |
281 | get this message even if the pathname to your executable is fully | |
282 | qualified. This is I<not> generated because you didn't supply a full path | |
283 | to the program; instead, it's generated because you never set your PATH | |
284 | environment variable, or you didn't set it to something that was safe. | |
285 | Because Perl can't guarantee that the executable in question isn't itself | |
286 | going to turn around and execute some other program that is dependent on | |
287 | your PATH, it makes sure you set the PATH. | |
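To see why a given PATH might be rejected, you can approximate the directory checks yourself. The sketch below is not Perl's actual implementation: C<path_problems()> is a hypothetical helper that flags entries that are not absolute or that are writable by others, based on the description above.

```perl
use strict;
use warnings;

# Hypothetical helper: list the reasons a PATH string looks unsafe.
sub path_problems {
    my ($path) = @_;
    my @problems;
    # A limit of -1 keeps trailing empty entries, which the shell
    # treats as the current directory.
    for my $dir (split /:/, $path, -1) {
        if ($dir !~ m{^/}) {
            push @problems, "'$dir' is not absolute";
            next;
        }
        my @st = stat $dir;
        next unless @st;    # directories that don't exist are skipped
        push @problems, "'$dir' is writable by others"
            if $st[2] & 0002;
    }
    return @problems;
}

my $path = defined $ENV{PATH} ? $ENV{PATH} : '';
print "$_\n" for path_problems($path);
```

An empty result does not make the PATH untainted; you still have to assign C<$ENV{'PATH'}> a known value as described above.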
a0d0e21e | 288 | |
a3cb178b GS |
289 | The PATH isn't the only environment variable which can cause problems. |
290 | Because some shells may use the variables IFS, CDPATH, ENV, and | |
291 | BASH_ENV, Perl checks that those are either empty or untainted when | |
91e64913 | 292 | starting subprocesses. You may wish to add something like this to your |
a3cb178b GS |
293 | setid and taint-checking scripts. |
294 | ||
295 | delete @ENV{qw(IFS CDPATH ENV BASH_ENV)}; # Make %ENV safer | |
296 | ||
a0d0e21e LW |
297 | It's also possible to get into trouble with other operations that don't |
298 | care whether they use tainted values. Make judicious use of the file | |
299 | tests in dealing with any user-supplied filenames. When possible, do | |
fb73857a | 300 | opens and such B<after> properly dropping any special user (or group!) |
91e64913 FC |
301 | privileges. Perl doesn't prevent you from |
302 | opening tainted filenames for reading, | |
a0d0e21e LW |
303 | so be careful what you print out. The tainting mechanism is intended to |
304 | prevent stupid mistakes, not to remove the need for thought. | |
305 | ||
23634c10 AL |
306 | Perl does not call the shell to expand wild cards when you pass C<system> |
307 | and C<exec> explicit parameter lists instead of strings with possible shell | |
308 | wildcards in them. Unfortunately, the C<open>, C<glob>, and | |
54310121 | 309 | backtick functions provide no such alternate calling convention, so more |
310 | subterfuge will be required. | |
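On Perl 5.8 and later, on platforms with a real C<fork>, C<open> also accepts a list form for pipes, which covers many uses of backticks without involving the shell. The following sketch assumes such a platform; C<run_capture()> is a hypothetical helper, and the child command is simply the running perl binary (C<$^X>) echoing its arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: run a program with an explicit argument list
# and return its output lines. No shell is involved, so wildcards and
# other metacharacters in @args reach the program unchanged.
sub run_capture {
    my ($prog, @args) = @_;
    open(my $kid, '-|', $prog, @args)
        or die "Can't run $prog: $!";
    my @lines = <$kid>;
    close $kid
        or warn "$prog exited with status $?\n";
    return @lines;
}

# The argument would be dangerous if a shell ever saw it.
my @out = run_capture($^X, '-e', 'print "$_\n" for @ARGV',
                      '*.c; echo pwned');
print @out;
```

Under taint mode the arguments must still be untainted, and the usual PATH checks still apply to the program being run.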
425e5e39 | 311 | |
312 | Perl provides a reasonably safe way to open a file or pipe from a setuid | |
313 | or setgid program: just create a child process with reduced privilege that | |
314 | does the dirty work for you. First, fork a child using the special | |
23634c10 | 315 | C<open> syntax that connects the parent and child by a pipe. Now the |
425e5e39 | 316 | child resets its ID set and any other per-process attributes, like |
317 | environment variables, umasks, current working directories, back to the | |
318 | originals or known safe values. Then the child process, which no longer | |
23634c10 | 319 | has any special permissions, does the C<open> or other system call. |
425e5e39 | 320 | Finally, the child passes the data it managed to access back to the |
5f05dabc | 321 | parent. Because the file or pipe was opened in the child while running |
425e5e39 | 322 | under less privilege than the parent, it's not apt to be tricked into |
323 | doing something it shouldn't. | |
324 | ||
23634c10 | 325 | Here's a way to do backticks reasonably safely. Notice how the C<exec> is |
425e5e39 | 326 | not called with a string that the shell could expand. This is by far the |
327 | best way to call something that might be subjected to shell escapes: just | |
fb73857a | 328 | never call the shell at all. |
cb1a09d0 | 329 | |
6ca3c6c6 | 330 | use English; |
e093bcf0 GW |
331 | die "Can't fork: $!" unless defined($pid = open(KID, "-|")); |
332 | if ($pid) { # parent | |
333 | while (<KID>) { | |
334 | # do something | |
335 | } | |
336 | close KID; | |
337 | } else { | |
338 | my @temp = ($EUID, $EGID); | |
339 | my $orig_uid = $UID; | |
340 | my $orig_gid = $GID; | |
341 | $EUID = $UID; | |
342 | $EGID = $GID; | |
343 | # Drop privileges | |
344 | $UID = $orig_uid; | |
345 | $GID = $orig_gid; | |
346 | # Make sure privs are really gone | |
347 | ($EUID, $EGID) = @temp; | |
348 | die "Can't drop privileges" | |
349 | unless $UID == $EUID && $GID eq $EGID; | |
350 | $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH. | |
351 | # Consider sanitizing the environment even more. | |
352 | exec 'myprog', 'arg1', 'arg2' | |
353 | or die "can't exec myprog: $!"; | |
354 | } | |
425e5e39 | 355 | |
fb73857a | 356 | A similar strategy would work for wildcard expansion via C<glob>, although |
357 | you can use C<readdir> instead. | |
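For instance, a rough stand-in for C<glob('*.c')> built on C<readdir> might look like the sketch below. The helper name C<list_by_suffix()> is an assumption for this example. As with C<glob>, the names it returns come from outside the program and so are tainted under taint mode.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted names in $dir that end in
# $suffix, skipping dotfiles the way the shell's "*" does.
sub list_by_suffix {
    my ($dir, $suffix) = @_;
    opendir(my $dh, $dir)
        or die "Can't opendir $dir: $!";
    my @names = sort grep { !/^\./ && /\Q$suffix\E\z/ } readdir $dh;
    closedir $dh;
    return @names;
}

print "$_\n" for list_by_suffix('.', '.c');
```

Because no shell or external program is involved, there is nothing for a hostile environment to redirect; only the directory contents themselves need to be treated with suspicion.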
425e5e39 | 358 | |
359 | Taint checking is most useful when, although you trust yourself not to have | |
360 | written a program to give away the farm, you don't necessarily trust those | |
361 | who end up using it not to try to trick it into doing something bad. This | |
fb73857a | 362 | is the kind of security checking that's useful for set-id programs and |
425e5e39 | 363 | programs launched on someone else's behalf, like CGI programs. |
364 | ||
365 | This is quite different, however, from not even trusting the writer of the | |
366 | code not to try to do something evil. That's the kind of trust needed | |
367 | when someone hands you a program you've never seen before and says, "Here, | |
18d7fc85 RGS |
368 | run this." For that kind of safety, you might want to check out the Safe |
369 | module, included standard in the Perl distribution. This module allows the | |
425e5e39 | 370 | programmer to set up special compartments in which all system operations |
18d7fc85 RGS |
371 | are trapped and namespace access is carefully controlled. Safe should |
372 | not be considered bullet-proof, though: it will not prevent the foreign | |
373 | code from setting up infinite loops, allocating gigabytes of memory, or even | |
374 | abusing perl bugs to make the host interpreter crash or behave in | |
91e64913 | 375 | unpredictable ways. In any case it's better avoided completely if you're |
18d7fc85 | 376 | really concerned about security. |
425e5e39 | 377 | |
b5145c7d | 378 | =head2 Shebang Race Condition |
425e5e39 | 379 | |
380 | Beyond the obvious problems that stem from giving special privileges to | |
fb73857a | 381 | systems as flexible as scripts, on many versions of Unix, set-id scripts |
425e5e39 | 382 | are inherently insecure right from the start. The problem is a race |
383 | condition in the kernel. Between the time the kernel opens the file to | |
fb73857a | 384 | see which interpreter to run and when the (now-set-id) interpreter turns |
425e5e39 | 385 | around and reopens the file to interpret it, the file in question may have |
386 | changed, especially if you have symbolic links on your system. | |
387 | ||
dabde021 | 388 | Some Unixes, especially more recent ones, are free of this |
b5145c7d Z |
389 | inherent security bug. On such systems, when the kernel passes the name |
390 | of the set-id script to open to the interpreter, rather than using a | |
391 | pathname subject to meddling, it instead passes I</dev/fd/3>. This is a | |
392 | special file already opened on the script, so that there can be no race | |
393 | condition for evil scripts to exploit. On these systems, Perl should be | |
394 | compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The F<Configure> | |
395 | program that builds Perl tries to figure this out for itself, so you | |
396 | should never have to specify this yourself. Most modern releases of | |
397 | SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition. | |
425e5e39 | 398 | |
b5145c7d Z |
399 | If you don't have the safe version of set-id scripts, all is not lost. |
400 | Sometimes this kernel "feature" can be disabled, so that the kernel | |
401 | either doesn't run set-id scripts with the set-id or doesn't run them | |
402 | at all. Either way avoids the exploitability of the race condition, | |
403 | but doesn't help in actually running scripts set-id. | |
404 | ||
405 | If the kernel set-id script feature isn't disabled, then any set-id | |
406 | script provides an exploitable vulnerability. Perl can't avoid being | |
407 | exploitable, but will point out vulnerable scripts where it can. If Perl | |
408 | detects that it is being applied to a set-id script then it will complain | |
409 | loudly that your set-id script is insecure, and won't run it. When Perl | |
410 | complains, you need to remove the set-id bit from the script to eliminate | |
411 | the vulnerability. Refusing to run the script doesn't in itself close | |
412 | the vulnerability; it is just Perl's way of encouraging you to do this. | |
413 | ||
414 | To actually run a script set-id, if you don't have the safe version of | |
415 | set-id scripts, you'll need to put a C wrapper around | |
425e5e39 | 416 | the script. A C wrapper is just a compiled program that does nothing |
417 | except call your Perl program. Compiled programs are not subject to the | |
fb73857a | 418 | kernel bug that plagues set-id scripts. Here's a simple wrapper, written |
425e5e39 | 419 | in C: |
420 | ||
245c138e LM |
421 | #include <unistd.h> |
422 | #include <stdio.h> | |
423 | #include <string.h> | |
424 | #include <errno.h> | |
425 | ||
425e5e39 | 426 | #define REAL_PATH "/path/to/script" |
245c138e LM |
427 | |
428 | int main(int argc, char **argv) | |
425e5e39 | 429 | { |
245c138e LM |
430 | execv(REAL_PATH, argv); |
431 | fprintf(stderr, "%s: %s: %s\n", | |
432 | argv[0], REAL_PATH, strerror(errno)); | |
433 | return 127; | |
54310121 | 434 | } |
cb1a09d0 | 435 | |
54310121 | 436 | Compile this wrapper into a binary executable and then make I<it> rather |
b5145c7d | 437 | than your script setuid or setgid. Note that this wrapper isn't doing |
dabde021 | 438 | anything to sanitise the execution environment other than ensuring |
b5145c7d Z |
439 | that a safe path to the script is used. It only avoids the shebang |
440 | race condition. It relies on Perl's own features, and on the script | |
441 | itself being careful, to make it safe enough to run the script set-id. | |
425e5e39 | 442 | |
68dc0745 | 443 | =head2 Protecting Your Programs |
444 | ||
445 | There are a number of ways to hide the source to your Perl programs, | |
446 | with varying levels of "security". | |
447 | ||
448 | First of all, however, you I<can't> take away read permission, because | |
449 | the source code has to be readable in order to be compiled and | |
450 | interpreted. (That doesn't mean that a CGI script's source is | |
451 | readable by people on the web, though.) So you have to leave the | |
5a964f20 TC |
452 | permissions at the socially friendly 0755 level. This lets |
453 | only people on your local system see your source. | |
68dc0745 | 454 | |
5a964f20 | 455 | Some people mistakenly regard this as a security problem. If your program does |
68dc0745 | 456 | insecure things, and relies on people not knowing how to exploit those |
457 | insecurities, it is not secure. It is often possible for someone to | |
458 | determine the insecure things and exploit them without viewing the | |
459 | source. Security through obscurity, the name for hiding your bugs | |
460 | instead of fixing them, is little security indeed. | |
461 | ||
83df6a1d JH |
462 | You can try using encryption via source filters (Filter::* from CPAN, |
463 | or Filter::Util::Call and Filter::Simple since Perl 5.8). | |
464 | But crackers might be able to decrypt it. You can try using the byte | |
465 | code compiler and interpreter described below, but crackers might be | |
466 | able to de-compile it. You can try using the native-code compiler | |
68dc0745 | 467 | described below, but crackers might be able to disassemble it. These |
468 | pose varying degrees of difficulty to people wanting to get at your | |
469 | code, but none can definitively conceal it (this is true of every | |
470 | language, not just Perl). | |
471 | ||
472 | If you're concerned about people profiting from your code, then the | |
3462340b | 473 | bottom line is that nothing but a restrictive license will give you |
68dc0745 | 474 | legal security. License your software and pepper it with threatening |
475 | statements like "This is unpublished proprietary software of XYZ Corp. | |
476 | Your access to it does not give you permission to use it blah blah | |
3462340b | 477 | blah." You should see a lawyer to be sure your license's wording will |
68dc0745 | 478 | stand up in court. |
5a964f20 | 479 | |
0d7c09bb JH |
480 | =head2 Unicode |
481 | ||
482 | Unicode is a new and complex technology and one may easily overlook | |
483 | certain security pitfalls. See L<perluniintro> for an overview and | |
484 | L<perlunicode> for details, and L<perlunicode/"Security Implications | |
485 | of Unicode"> for security implications in particular. | |
486 | ||
504f80c1 JH |
487 | =head2 Algorithmic Complexity Attacks |
488 | ||
489 | Certain internal algorithms used in the implementation of Perl can | |
490 | be attacked by choosing the input carefully to consume large amounts | |
491 | of either time or space or both. This can lead to so-called | |
492 | I<Denial of Service> (DoS) attacks. | |
493 | ||
494 | =over 4 | |
495 | ||
496 | =item * | |
497 | ||
6a5b4183 YO |
498 | Hash Algorithm - Hash algorithms like the one used in Perl are well |
499 | known to be vulnerable to collision attacks on their hash function. | |
500 | Such attacks involve constructing a set of keys which collide into | |
91e64913 | 501 | the same bucket producing inefficient behavior. Such attacks often |
6a5b4183 | 502 | depend on discovering the seed of the hash function used to map the |
91e64913 FC |
503 | keys to buckets. That seed is then used to brute-force a key set which |
504 | can be used to mount a denial of service attack. In Perl 5.8.1 changes | |
6a5b4183 YO |
505 | were introduced to harden Perl against such attacks, and then later in |
506 | Perl 5.18.0 these features were enhanced and additional protections | |
507 | added. | |
508 | ||
4d74c8eb S |
509 | At the time of this writing, Perl 5.18.0 is considered to be |
510 | well-hardened against algorithmic complexity attacks on its hash | |
91e64913 | 511 | implementation. This is largely owed to the following measures |
4d74c8eb | 512 | that mitigate attacks: |
6a5b4183 YO |
513 | |
514 | =over 4 | |
515 | ||
516 | =item Hash Seed Randomization | |
517 | ||
518 | In order to make it impossible to know what seed to generate an attack | |
91e64913 | 519 | key set for, this seed is randomly initialized at process start. This |
4d74c8eb | 520 | may be overridden by using the PERL_HASH_SEED environment variable, see |
91e64913 | 521 | L<perlrun/PERL_HASH_SEED>. This environment variable controls how |
4d74c8eb S |
522 | items are actually stored, not how they are presented via |
523 | C<keys>, C<values> and C<each>. | |
6a5b4183 YO |
524 | |
525 | =item Hash Traversal Randomization | |
526 | ||
4d74c8eb | 527 | Independent of which seed is used in the hash function, C<keys>, |
6a5b4183 YO |
528 | C<values>, and C<each> return items in a per-hash randomized order. |
529 | Modifying a hash by insertion will change the iteration order of that hash. | |
4d74c8eb | 530 | This behavior can be overridden by using C<hash_traversal_mask()> from |
6a5b4183 | 531 | L<Hash::Util> or by using the PERL_PERTURB_KEYS environment variable, |
91e64913 | 532 | see L<perlrun/PERL_PERTURB_KEYS>. Note that this feature controls the |
6a5b4183 YO |
533 | "visible" order of the keys, and not the actual order they are stored in. |
534 | ||
535 | =item Bucket Order Perturbance | |
536 | ||
4d74c8eb | 537 | When items collide into a given hash bucket the order they are stored in |
91e64913 FC |
538 | the chain is no longer predictable in Perl 5.18. This |
539 | is intended to make it harder to observe a | |
c6c886ef | 540 | collision. This behavior can be overridden by using |
6a5b4183 YO |
541 | the PERL_PERTURB_KEYS environment variable, see L<perlrun/PERL_PERTURB_KEYS>. |
542 | ||
543 | =item New Default Hash Function | |
544 | ||
545 | The default hash function has been modified with the intention of making | |
546 | it harder to infer the hash seed. | |
547 | ||
548 | =item Alternative Hash Functions | |
549 | ||
550 | The source code includes multiple hash algorithms to choose from. While we | |
4d74c8eb | 551 | believe that the default perl hash is robust to attack, we have included the |
91e64913 | 552 | hash function Siphash as a fall-back option. At the time of release of |
6a5b4183 YO |
553 | Perl 5.18.0 Siphash is believed to be of cryptographic strength. This is |
554 | not the default as it is much slower than the default hash. | |
555 | ||
556 | =back | |
557 | ||
4d74c8eb | 558 | Without compiling a special Perl, there is no way to get the exact same |
91e64913 | 559 | behavior of any versions prior to Perl 5.18.0. The closest one can get |
6a5b4183 | 560 | is by setting PERL_PERTURB_KEYS to 0 and setting the PERL_HASH_SEED |
91e64913 | 561 | to a known value. We do not advise those settings for production use |
4d74c8eb | 562 | due to the above security considerations. |
6a5b4183 YO |
563 | |
564 | B<Perl has never guaranteed any ordering of the hash keys>, and | |
565 | the ordering has already changed several times during the lifetime of | |
566 | Perl 5. Also, the ordering of hash keys has always been, and continues | |
567 | to be, affected by the insertion order and the history of changes made | |
568 | to the hash over its lifetime. | |
7b3f7037 JH |
569 | |
570 | Also note that while the order of the hash elements might be | |
4d74c8eb S |
571 | randomized, this "pseudo-ordering" should B<not> be used for |
572 | applications like shuffling a list randomly (use C<List::Util::shuffle()> | |
7b3f7037 | 573 | for that, see L<List::Util>, a standard core module since Perl 5.8.0; |
4d74c8eb S |
574 | or the CPAN module C<Algorithm::Numerical::Shuffle>), or for generating |
575 | permutations (use e.g. the CPAN modules C<Algorithm::Permute> or | |
576 | C<Algorithm::FastPermute>), or for any cryptographic applications. | |
7b3f7037 | 577 | |
883f220b TC |
578 | Tied hashes may have their own ordering and algorithmic complexity |
579 | attacks. | |
580 | ||
504f80c1 JH |
581 | =item * |
582 | ||
5a4e8ea7 P |
583 | Regular expressions - Perl's regular expression engine is a so-called NFA | |
584 | (Non-deterministic Finite Automaton), which among other things means that | |
585 | it can rather easily consume large amounts of both time and space if the | |
504f80c1 JH |
586 | regular expression may match in several ways. Careful crafting of the |
587 | regular expressions can help but quite often there really isn't much | |
588 | one can do (the book "Mastering Regular Expressions" is required | |
589 | reading, see L<perlfaq2>). Running out of space manifests itself by | |
590 | Perl running out of memory. | |
591 | ||
592 | =item * | |
593 | ||
594 | Sorting - the quicksort algorithm used in Perls before 5.8.0 to | |
e2091bb6 | 595 | implement the sort() function was very easy to trick into misbehaving |
3462340b JL |
596 | so that it consumes a lot of time. Starting from Perl 5.8.0 a different |
597 | sorting algorithm, mergesort, is used by default. Mergesort's worst-case | |
598 | running time is O(N log N), so it cannot misbehave on any input. | |
504f80c1 JH |
599 | |
600 | =back | |
601 | ||
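As a hedged illustration of the hash-ordering points in the list above: code
that needs a repeatable order should sort the keys itself, and code that
needs a random order should use C<List::Util::shuffle()> rather than rely on
hash traversal. The hash C<%port> and its contents are invented for this
sketch.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical data, used only for this example.
my %port = (http => 80, https => 443, ssh => 22, smtp => 25);

# The order returned by keys() may differ from run to run, so impose
# an explicit order whenever the output must be repeatable.
my @stable = sort keys %port;

# For a random order, shuffle explicitly instead of relying on the
# hash's internal "pseudo-ordering".
my @random = shuffle(keys %port);

print "stable: @stable\n";
print "random: @random\n";
```

Only the C<stable:> line is the same on every run; the C<random:> line is
expected to vary.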
b25b06cf | 602 | See L<https://www.usenix.org/legacy/events/sec03/tech/full_papers/crosby/crosby.pdf> for more information, |
3462340b | 603 | and any computer science textbook on algorithmic complexity. |
504f80c1 | 604 | |
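A minimal sketch of the kind of careful crafting mentioned for regular
expressions above. Both patterns are made up for illustration. They accept
the same strings, but the first lets a backtracking engine split a run of
digits in many different ways, while the second allows only one. How much
time the first form actually costs depends on the input and on the
optimisations in the Perl version in use, so no timings are claimed here.

```perl
use strict;
use warnings;

# Nested quantifiers: a run of digits can be divided between the inner
# and outer "+" in many ways, all of which may be tried on a failed match.
my $ambiguous = qr/^(?:\d+)+$/;

# Same set of accepted strings, but only one way to match each of them.
my $unambiguous = qr/^\d+$/;

# Returns true if both patterns agree on the given string.
sub same_verdict {
    my ($string) = @_;
    my $first  = ($string =~ $ambiguous)   ? 1 : 0;
    my $second = ($string =~ $unambiguous) ? 1 : 0;
    return $first == $second;
}

for my $input ("12345", "1234x") {
    printf "%-6s agree=%d\n", $input, same_verdict($input) ? 1 : 0;
}
```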
b5145c7d Z |
605 | =head2 Using Sudo |
606 | ||
607 | The popular tool C<sudo> provides a controlled way for users to run | |
608 | programs as other users. It sanitises the execution environment | |
609 | to some extent, and will avoid the L<shebang race condition|/"Shebang | |
610 | Race Condition">. If you don't have the safe version of set-id scripts, | |
611 | then C<sudo> may be a more convenient way of executing a script as | |
612 | another user than writing a C wrapper would be. | |
613 | ||
614 | However, C<sudo> sets the real user or group ID to that of the target | |
615 | identity, not just the effective ID as set-id bits do. As a result, Perl | |
616 | can't detect that it is running under C<sudo>, and so won't automatically | |
617 | take its own security precautions such as turning on taint mode. Where | |
618 | C<sudo> configuration dictates exactly which command can be run, the | |
619 | approved command may include a C<-T> option to perl to enable taint mode. | |
620 | ||
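Because Perl will not turn on taint mode by itself under C<sudo>, a script
can check for it explicitly. The following is a sketch under stated
assumptions, not a complete defence: the helper names are hypothetical, and
the C<SUDO_USER> environment variable (which C<sudo> normally sets) is used
only as a hint, since environment variables can be set by anyone. The
special variable C<${^TAINT}> is positive only when taint mode is fully
enabled.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the environment suggests a sudo invocation.
# SUDO_USER is only a hint, not proof.
sub started_via_sudo {
    my ($env) = @_;
    return (defined $env->{SUDO_USER} && length $env->{SUDO_USER}) ? 1 : 0;
}

# ${^TAINT} is 1 under -T, -1 under -t (warnings only), and 0 otherwise.
sub taint_mode_on {
    return ${^TAINT} > 0 ? 1 : 0;
}

# Hypothetical policy: refuse to continue under sudo without taint mode.
sub check_environment {
    my ($env, $taint) = @_;
    return "ok" unless started_via_sudo($env);
    return $taint ? "ok" : "refuse: running under sudo without -T";
}

my $verdict = check_environment(\%ENV, taint_mode_on());
print "$verdict\n";
```

A real script would typically C<die> instead of printing when the verdict is
not C<ok>, and would start with a C<#!> line that includes C<-T>.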
621 | In general, it is necessary to evaluate the suitability of a script to | |
622 | run under C<sudo> specifically with that kind of execution environment | |
623 | in mind. It is neither necessary nor sufficient for the same script to | |
624 | be suitable to run in a traditional set-id arrangement, though many of | |
625 | the issues overlap. | |
626 | ||
5a964f20 TC |
627 | =head1 SEE ALSO |
628 | ||
028611fa DB |
629 | L<perlrun/ENVIRONMENT> for its description of cleaning up environment |
630 | variables. |