=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs. Unlike most
command-line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags. Additionally, because the language has more
built-in functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs. The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set. You can also enable taint
mode explicitly by using the B<-T> command-line flag. This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script.

While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps. Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these. Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a setuid Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect something
else outside your program--at least, not by accident. All command-line
arguments, environment variables, and file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command that
invokes a subshell, nor in any command that modifies files, directories,
or processes. Any variable set within an expression that has previously
referenced a tainted value itself becomes tainted, even if it is logically
impossible for the tainted value to influence the variable. Because
taintedness is associated with each scalar value, some elements of an
array can be tainted and others not.

For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg, 'bar';         # $hid is also tainted
    $line = <>;                 # Tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted

    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Secure (doesn't use sh)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set

    $path = $ENV{'PATH'};       # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    $ENV{'IFS'} = '' if $ENV{'IFS'} ne '';

    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!

    open(FOO, "< $arg");        # OK - read-only file
    open(FOO, "> $arg");        # Not OK - trying to write

    open(FOO,"echo $arg|");     # Not OK, but...
    open(FOO,"-|")
        or exec 'echo', $arg;   # OK

    $shout = `echo $arg`;       # Insecure, $shout now tainted

    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure

    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Secure (doesn't use the shell)
    exec "sh", '-c', $arg;      # Considered secure, alas!

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure PATH". Note that you
can still write an insecure B<system> or B<exec>, but only by explicitly
doing something like the last example above.

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, whose use would thus
trigger an "Insecure dependency" message, you can use the following
I<is_tainted()> function.

    sub is_tainted {
        return ! eval {
            join('',@_), kill 0;
            1;
        };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted. It
would be inefficient for every operator to test every argument for
taintedness. Instead, Perl takes a slightly more efficient and
conservative approach: if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.
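
Where the C<Scalar::Util> module is installed (an assumption; it is not part of the text above), its C<tainted()> function answers the same question as I<is_tainted()> without the C<eval> trick. The following is only a sketch, and C<report_taint> is a hypothetical helper name chosen for illustration:

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: describe whether a value is currently tainted.
# Outside taint mode (no -T, no setuid/setgid), nothing is ever tainted.
sub report_taint {
    my ($label, $value) = @_;
    return sprintf "%s is %s", $label,
        tainted($value) ? "tainted" : "not tainted";
}

my $literal = 'abc';    # a constant from the program text, never tainted
print report_taint('$literal', $literal), "\n";

# Environment values are reported as tainted only when running under -T.
print report_taint('$ENV{PATH}', $ENV{PATH}), "\n";
```

Running the sketch once with B<-T> and once without shows the difference for the environment value, while the literal stays untainted in both cases.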

But testing for taintedness only gets you so far. Sometimes you just have
to clear your data's taintedness. The only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc., you
knew what you were doing when you wrote the pattern. That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism. It's better to verify that the variable has only
good characters (for certain values of "good") rather than checking
whether it has any bad characters. That's because it's far too easy to
miss bad characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in $data";        # log this somewhere
    }

This is fairly secure since C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell. Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that. The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Laundering data using a regular expression is the I<ONLY> mechanism for
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.
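
The same whitelist test can be wrapped in a small function so that every caller launders data in one place. This is a minimal sketch rather than a recommended policy: C<launder_word> is a hypothetical name, and the set of accepted characters is simply the one used in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted copy of $value if it consists
# solely of word characters, hyphens, at signs, and dots; otherwise undef.
sub launder_word {
    my ($value) = @_;
    return unless defined $value;
    # \A and \z anchor at the true ends of the string, so input with a
    # trailing newline is rejected instead of having the newline dropped.
    if ($value =~ /\A([-\@\w.]+)\z/) {
        return $1;      # the captured substring is considered clean
    }
    return;
}

for my $candidate ('report-2.txt', 'user@example.com', 'rm -rf /; echo') {
    my $clean = launder_word($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected: $candidate\n";
}
```

Returning C<undef> instead of calling C<die> leaves it to the caller to decide whether bad input is fatal or merely logged.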

=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to a
known value, and each directory in the path must be non-writable by others
than its owner and group. You may be surprised to get this message even
if the pathname to your executable is fully qualified. This is I<not>
generated because you didn't supply a full path to the program; instead,
it's generated because you never set your PATH environment variable, or
you didn't set it to something that was safe. Because Perl can't
guarantee that the executable in question isn't itself going to turn
around and execute some other program that is dependent on your PATH, it
makes sure you set the PATH.
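
One way to satisfy this check is to set the environment once, near the top of the program. The sketch below assumes that F</bin> and F</usr/bin> are the right directories for the system at hand. It also clears a few variables that shells are known to consult (C<IFS>, C<CDPATH>, C<ENV>, C<BASH_ENV>); that list is an assumption added for illustration, and C<sanitize_env> is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: give child processes a small, known environment.
sub sanitize_env {
    # Assumed-safe directories; adjust them for the local system.
    $ENV{PATH} = '/bin:/usr/bin';
    # Variables that can change how a shell parses or locates commands.
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
    return;
}

sanitize_env();
print "PATH is now $ENV{PATH}\n";
```

Because the new value of C<$ENV{PATH}> comes from a constant in the program, it is not tainted.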

It's also possible to get into trouble with other operations that don't
care whether they use tainted values. Make judicious use of the file
tests in dealing with any user-supplied filenames. When possible, do
opens and such after setting C<$E<gt> = $E<lt>>. (Remember group IDs,
too!) Perl doesn't prevent you from opening tainted filenames for reading,
so be careful what you print out. The tainting mechanism is intended to
prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wildcards when you pass B<system>
and B<exec> explicit parameter lists instead of strings with possible shell
wildcards in them. Unfortunately, the B<open>, B<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.
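
Perl releases from 5.8 onward relax this limitation for pipes on Unix-like systems: B<open> accepts the mode C<-|> followed by a list, and then runs the command without a shell, much as list-form B<exec> does. The sketch below assumes such a release. C<capture_without_shell> is a hypothetical helper name, and the child command is the running perl itself (C<$^X>), so the example does not depend on any particular external program.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command without a shell and return its output.
# The program and each argument are passed as separate list elements.
sub capture_without_shell {
    my (@command) = @_;
    open(my $kid, '-|', @command)
        or die "can't run $command[0]: $!";
    local $/;                   # read all of the child's output at once
    my $output = <$kid>;
    close $kid;
    return $output;
}

# The "*" reaches the child verbatim because no shell is present
# to expand it.
my $echoed = capture_without_shell($^X, '-e', 'print $ARGV[0]', '*');
print "child saw: $echoed\n";
```

Tainted arguments are still subject to taint checks in this form, so it complements laundering; it does not replace it.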

Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
does the dirty work for you. First, fork a child using the special
B<open> syntax that connects the parent and child by a pipe. Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values. Then the child process, which no longer
has any special permissions, does the B<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent. Since the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely. Notice how the B<exec> is
not called with a string that the shell could expand. This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all. By the time we get to the B<exec>, tainting
is turned off, however, so be careful what you call and what you pass it.

    use English;
    # Parenthesize the assignment: "defined $pid = ..." does not compile.
    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
    if ($pid) {                 # parent
        while (<KID>) {
            # do something
        }
        close KID;
    } else {                    # child
        $EUID = $UID;
        $EGID = $GID;           # XXX: initgroups() not called
        $ENV{PATH} = "/bin:/usr/bin";
        exec 'myprog', 'arg1', 'arg2';
        die "can't exec myprog: $!";
    }

A similar strategy would work for wildcard expansion via C<glob>.

Taint checking is most useful when, although you trust yourself not to have
written a program to give away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad. This
is the kind of security checking that's useful for setuid programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil. That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this." For that kind of safety, check out the Safe module,
included standard in the Perl distribution. This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled.
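
As a rough illustration of such a compartment, the sketch below evaluates two strings with the Safe module's C<reval> method. It assumes the module's default operator mask, under which ordinary arithmetic is allowed and C<system> is not. The code strings are made-up stand-ins for an untrusted program.

```perl
use strict;
use warnings;
use Safe;

# A compartment with Safe's default operator mask.
my $compartment = Safe->new;

# Plain arithmetic is permitted inside the compartment.
my $sum = $compartment->reval('2 + 3');
print "sum: $sum\n";

# An attempt to run an external command is refused: reval returns
# undef and leaves the reason in $@.
my $blocked = $compartment->reval('system("echo", "hi")');
my $error   = $@;
print "refused: $error" unless defined $blocked;
```

The refusal happens when the string is compiled inside the compartment, so the external command is never started.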

=head2 Security Bugs

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, setuid scripts
are inherently insecure right from the start. The problem is a race
condition in the kernel. Between the time the kernel opens the file to
see which interpreter to run and when the (now-setuid) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Fortunately, sometimes this kernel "feature" can be disabled.
Unfortunately, there are two ways to disable it. The system can simply
outlaw scripts with the setuid bit set, which doesn't help much.
Alternatively, it can simply ignore the setuid bit on scripts. If the
latter is true, Perl can emulate the setuid and setgid mechanism when it
notices the otherwise useless setuid/gid bits on Perl scripts. It does
this via a special executable called B<suidperl> that is automatically
invoked for you if it's needed.

However, if the kernel setuid script feature isn't disabled, Perl will
complain loudly that your setuid script is insecure. You'll need to
either disable the kernel setuid script feature, or put a C wrapper around
the script. A C wrapper is just a compiled program that does nothing
except call your Perl program. Compiled programs are not subject to the
kernel bug that plagues setuid scripts. Here's a simple wrapper, written
in C:

    #include <unistd.h>

    #define REAL_PATH "/path/to/script"

    int main(int ac, char **av)
    {
        execv(REAL_PATH, av);
        return 1;       /* reached only if execv() fails */
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid.

See the program B<wrapsuid> in the F<eg> directory of your Perl
distribution for a convenient way to do this automatically for all your
setuid Perl programs. It moves setuid scripts into files with the same
name plus a leading dot, and then compiles a wrapper like the one above
for each of them.

In recent years, vendors have begun to supply systems free of this
inherent security bug. On such systems, when the kernel passes the name
of the setuid script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>. This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit. On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The B<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself. Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

Prior to release 5.003 of Perl, a bug in the code of B<suidperl> could
introduce a security hole in systems compiled with strict POSIX
compliance.