=back
-The value of an expression containing tainted data will itself be
-tainted, even if it is logically impossible for the tainted data to
-affect the value.
+For efficiency reasons, Perl takes a conservative view of
+whether data is tainted. If an expression contains tainted data,
+any subexpression may be considered tainted, even if the value
+of the subexpression is not itself affected by the tainted data.
Because taintedness is associated with each scalar value, some
-elements of an array can be tainted and others not.
+elements of an array or hash can be tainted and others not.
+The keys of a hash are never tainted.
For example:
thus trigger an "Insecure dependency" message, you can use the
tainted() function of the Scalar::Util module, available in your
nearby CPAN mirror, and included in Perl starting from the release 5.8.0.
-Or you may be able to use the following I<is_tainted()> function.
+Or you may be able to use the following C<is_tainted()> function.
sub is_tainted {
return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
same expression, the whole expression is considered tainted.
But testing for taintedness gets you only so far. Sometimes you have just
-to clear your data's taintedness. The only way to bypass the tainting
+to clear your data's taintedness. Values may be untainted by using them
+as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc., that
you knew what you were doing when you wrote the pattern. That means using
if ($data =~ /^([-\@\w.]+)$/) {
$data = $1; # $data now untainted
} else {
- die "Bad data in $data"; # log this somewhere
+ die "Bad data in '$data'"; # log this somewhere
}
This is fairly secure because C</\w+/> doesn't normally match shell
under such systems. (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)
+=head2 Taint mode and @INC
+
+When the taint mode (C<-T>) is in effect, the "." directory is removed
+from C<@INC>, and the environment variables C<PERL5LIB> and C<PERLLIB>
+are ignored by Perl. You can still adjust C<@INC> from outside the
+program by using the C<-I> command line option as explained in
+L<perlrun>. The two environment variables are ignored because
+they are not readily visible, and a user running a program could be
+unaware that they are set, whereas the C<-I> option is clearly visible
+and therefore permitted.
+
+Another way to modify C<@INC> without modifying the program is to use
+the C<lib> pragma, e.g.:
+
+ perl -Mlib=/foo program
+
+The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
+will automagically remove any duplicated directories, while the latter
+will not.
+
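+A minimal sketch of the duplicate handling described above. It assumes
+that calling C<< lib->import >> directly is equivalent to C<-Mlib=/foo>,
+and it only simulates a repeated C<-I/foo> by pushing the path onto
+C<@INC> twice. The directory F</foo> is hypothetical and need not exist.
+

```perl
use strict;
use warnings;
require lib;    # load lib.pm without calling import() yet

# '/foo' is a hypothetical directory; it does not need to exist.
my $dir = '/foo';

# Simulate the effect of passing -I/foo twice: the path is listed twice.
push @INC, $dir, $dir;
my $before = grep { $_ eq $dir } @INC;

# Roughly what "use lib '/foo'" or -Mlib=/foo does: the path is put at
# the front of @INC and any later duplicates are dropped.
lib->import($dir);
my $after = grep { $_ eq $dir } @INC;

print "entries for $dir before: $before, after: $after\n";
```

+
+On a system where F</foo> is not already in C<@INC>, the count should
+drop to a single entry once the pragma has run.
+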
=head2 Cleaning Up Your Path
For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to a
L<perlunicode> for details, and L<perlunicode/"Security Implications
of Unicode"> for security implications in particular.
+=head2 Algorithmic Complexity Attacks
+
+Certain internal algorithms used in the implementation of Perl can
+be attacked by choosing the input carefully to consume large amounts
+of either time or space or both. This can lead to so-called
+I<Denial of Service> (DoS) attacks.
+
+=over 4
+
+=item *
+
+Hash Function - the algorithm used to "order" hash elements has been
+changed several times during the development of Perl, mainly to be
+reasonably fast. In Perl 5.8.1 the security aspect was also taken
+into account.
+
+In Perls before 5.8.1 one could rather easily generate data that, when
+used as hash keys, would cause Perl to consume large amounts of time
+because the internal structure of hashes would badly degenerate. In
+Perl 5.8.1 the hash function is randomly perturbed by a pseudorandom
+seed, which makes generating such naughty hash keys harder.
+See L<perlrun/PERL_HASH_SEED> for more information.
+
+The random perturbation is done by default, but if one wants for some
+reason to emulate the old behaviour, one can set the environment variable
+PERL_HASH_SEED to zero (or any other integer). One possible reason
+for wanting to emulate the old behaviour is that in the new behaviour
+consecutive runs of Perl will order hash keys differently, which may
+confuse some applications (like Data::Dumper: the outputs of two
+different runs are no longer identical).
+
+B<Perl has never guaranteed any ordering of the hash keys>, and the
+ordering has already changed several times during the lifetime of
+Perl 5. Also, the ordering of hash keys has always been, and
+continues to be, affected by the insertion order.
+
+Also note that while the order of the hash elements might be
+randomised, this "pseudoordering" should B<not> be used for
+applications like shuffling a list randomly (use List::Util::shuffle()
+for that, see L<List::Util>, a standard core module since Perl 5.8.0;
+or the CPAN module Algorithm::Numerical::Shuffle), or for generating
+permutations (use e.g. the CPAN modules Algorithm::Permute or
+Algorithm::FastPermute), or for any cryptographic applications.
+
+=item *
+
+Regular expressions - Perl's regular expression engine is a so-called
+NFA (Non-deterministic Finite Automaton), which among other things means that it can
+rather easily consume large amounts of both time and space if the
+regular expression may match in several ways. Careful crafting of the
+regular expressions can help but quite often there really isn't much
+one can do (the book "Mastering Regular Expressions" is required
+reading, see L<perlfaq2>). Running out of space manifests itself by
+Perl running out of memory.
+
+=item *
+
+Sorting - the quicksort algorithm used in Perls before 5.8.0 to
+implement the sort() function is very easy to trick into misbehaving
+so that it consumes a lot of time. Nothing more is required than
+re-sorting a list that is already sorted. Starting from Perl 5.8.0 a
+different sorting algorithm, mergesort, is used. The running time of
+mergesort is largely insensitive to the order of its input data, so it
+cannot be similarly fooled.
+
+=back
+
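+As a hedged illustration of the hash ordering advice above, the
+following sketch imposes an explicit order with C<sort> when repeatable
+output is needed, and uses C<List::Util::shuffle()> when a random order
+is wanted. The hash C<%port_for> and its contents are hypothetical
+sample data.
+

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical sample data, for illustration only.
my %port_for = (ftp => 21, ssh => 22, smtp => 25, http => 80);

# Repeatable output: impose an explicit order instead of relying on
# whatever order keys() happens to return in this run.
my @ordered = sort keys %port_for;
print "$_ => $port_for{$_}\n" for @ordered;

# Random order: shuffle explicitly rather than treating the hash
# order as a source of randomness.
my @shuffled = shuffle(@ordered);
print "shuffled: @shuffled\n";
```

+
+Sorting the keys keeps output stable between runs regardless of the
+hash seed. For Data::Dumper output, setting C<$Data::Dumper::Sortkeys>
+to a true value may serve the same purpose.
+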
+See L<http://www.cs.rice.edu/~scrosby/hash/> for more information,
+and any computer science textbook on algorithmic complexity.
+
=head1 SEE ALSO
L<perlrun> for its description of cleaning up environment variables.