This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
File::Find pod: fix link issue
[perl5.git] / lib / File / Find.pm
CommitLineData
a0d0e21e 1package File::Find;
3b825e41 2use 5.006;
b75c8c73 3use strict;
b395063c 4use warnings;
cd68ec93 5use warnings::register;
f4d05d5f 6our $VERSION = '1.20';
a0d0e21e 7require Exporter;
6280b799 8require Cwd;
a0d0e21e 9
7bd31527 10#
98dc9551 11# Modified to ensure sub-directory traversal order is not inverted by stack
7bd31527
JH
12# push and pops. That is remains in the same order as in the directory file,
13# or user pre-processing (EG:sorted).
14#
15
f06db76b
AD
16=head1 NAME
17
abfdd623 18File::Find - Traverse a directory tree.
f06db76b
AD
19
20=head1 SYNOPSIS
21
22 use File::Find;
6d355c6e 23 find(\&wanted, @directories_to_search);
f06db76b 24 sub wanted { ... }
237437d0 25
f06db76b 26 use File::Find;
abfdd623 27 finddepth(\&wanted, @directories_to_search);
f06db76b 28 sub wanted { ... }
3cb6de81 29
81793b90
GS
30 use File::Find;
31 find({ wanted => \&process, follow => 1 }, '.');
f06db76b
AD
32
33=head1 DESCRIPTION
34
abfdd623
MS
35These are functions for searching through directory trees doing work
36on each file found similar to the Unix I<find> command. File::Find
37exports two functions, C<find> and C<finddepth>. They work similarly
38but have subtle differences.
39
40=over 4
41
42=item B<find>
43
44 find(\&wanted, @directories);
45 find(\%options, @directories);
46
6eb87ff8
MJD
47C<find()> does a depth-first search over the given C<@directories> in
48the order they are given. For each file or directory found, it calls
49the C<&wanted> subroutine. (See below for details on how to use the
50C<&wanted> function). Additionally, for each directory found, it will
51C<chdir()> into that directory and continue the search, invoking the
52C<&wanted> function on each file or subdirectory in the directory.
abfdd623
MS
53
54=item B<finddepth>
55
56 finddepth(\&wanted, @directories);
57 finddepth(\%options, @directories);
58
b807ffce 59C<finddepth()> works just like C<find()> except that it invokes the
6eb87ff8
MJD
60C<&wanted> function for a directory I<after> invoking it for the
61directory's contents. It does a postorder traversal instead of a
62preorder traversal, working from the bottom of the directory tree up
63where C<find()> works from the top of the tree down.
abfdd623
MS
64
65=back
66
67=head2 %options
68
95e23d19
AL
69The first argument to C<find()> is either a code reference to your
70C<&wanted> function, or a hash reference describing the operations
71to be performed for each file. The
abfdd623 72code reference is described in L<The wanted function> below.
20408e3c 73
81793b90
GS
74Here are the possible keys for the hash:
75
76=over 3
77
78=item C<wanted>
79
abfdd623 80The value should be a code reference. This code reference is
4c90698d
RB
81described in L<The wanted function> below. The C<&wanted> subroutine is
82mandatory.
81793b90
GS
83
84=item C<bydepth>
85
86Reports the name of a directory only AFTER all its entries
95e23d19 87have been reported. Entry point C<finddepth()> is a shortcut for
788c0c32 88specifying C<< { bydepth => 1 } >> in the first argument of C<find()>.
81793b90 89
719c805e
JS
90=item C<preprocess>
91
3555aed3
SP
92The value should be a code reference. This code reference is used to
93preprocess the current directory. The name of the currently processed
95e23d19
AL
94directory is in C<$File::Find::dir>. Your preprocessing function is
95called after C<readdir()>, but before the loop that calls the C<wanted()>
3555aed3
SP
96function. It is called with a list of strings (actually file/directory
97names) and is expected to return a list of strings. The code can be
98used to sort the file/directory names alphabetically, numerically,
99or to filter out directory entries based on their name alone. When
7e47e6ff 100I<follow> or I<follow_fast> are in effect, C<preprocess> is a no-op.
719c805e
JS
101
102=item C<postprocess>
103
3555aed3
SP
104The value should be a code reference. It is invoked just before leaving
105the currently processed directory. It is called in void context with no
106arguments. The name of the current directory is in C<$File::Find::dir>. This
107hook is handy for summarizing a directory, such as calculating its disk
108usage. When I<follow> or I<follow_fast> are in effect, C<postprocess> is a
7e47e6ff 109no-op.
719c805e 110
81793b90
GS
111=item C<follow>
112
113Causes symbolic links to be followed. Since directory trees with symbolic
114links (followed) may contain files more than once and may even have
115cycles, a hash has to be built up with an entry for each file.
116This might be expensive both in space and time for a large
f4d05d5f 117directory tree. See L</follow_fast> and L</follow_skip> below.
81793b90
GS
118If either I<follow> or I<follow_fast> is in effect:
119
120=over 6
121
a45bd81d 122=item *
81793b90 123
54bd407c
RGS
124It is guaranteed that an I<lstat> has been called before the user's
125C<wanted()> function is called. This enables fast file checks involving S<_>.
126Note that this guarantee no longer holds if I<follow> or I<follow_fast>
127are not set.
81793b90 128
a45bd81d 129=item *
81793b90
GS
130
131There is a variable C<$File::Find::fullname> which holds the absolute
615a2b9b
SP
132pathname of the file with all symbolic links resolved. If the link is
133a dangling symbolic link, then fullname will be set to C<undef>.
81793b90
GS
134
135=back
136
204b4d7f
SP
137This is a no-op on Win32.
138
81793b90
GS
139=item C<follow_fast>
140
f10e1564
RM
141This is similar to I<follow> except that it may report some files more
142than once. It does detect cycles, however. Since only symbolic links
143have to be hashed, this is much cheaper both in space and time. If
95e23d19 144processing a file more than once (by the user's C<wanted()> function)
81793b90
GS
145is worse than just taking time, the option I<follow> should be used.
146
204b4d7f
SP
147This is also a no-op on Win32.
148
81793b90
GS
149=item C<follow_skip>
150
151C<follow_skip==1>, which is the default, causes all files which are
152neither directories nor symbolic links to be ignored if they are about
3555aed3 153to be processed a second time. If a directory or a symbolic link
81793b90 154are about to be processed a second time, File::Find dies.
95e23d19 155
81793b90
GS
156C<follow_skip==0> causes File::Find to die if any file is about to be
157processed a second time.
95e23d19 158
81793b90 159C<follow_skip==2> causes File::Find to ignore any duplicate files and
7e47e6ff 160directories but to proceed normally otherwise.
20408e3c 161
80e52b73
JH
162=item C<dangling_symlinks>
163
164If true and a code reference, will be called with the symbolic link
165name and the directory it lives in as arguments. Otherwise, if true
166and warnings are on, warning "symbolic_link_name is a dangling
167symbolic link\n" will be issued. If false, the dangling symbolic link
168will be silently ignored.
f06db76b 169
81793b90
GS
170=item C<no_chdir>
171
95e23d19 172Does not C<chdir()> to each directory as it recurses. The C<wanted()>
81793b90
GS
173function will need to be aware of this, of course. In this case,
174C<$_> will be the same as C<$File::Find::name>.
175
176=item C<untaint>
177
178If find is used in taint-mode (-T command line switch or if EUID != UID
179or if EGID != GID) then internally directory names have to be untainted
7e47e6ff 180before they can be chdir'ed to. Therefore they are checked against a regular
3555aed3
SP
181expression I<untaint_pattern>. Note that all names passed to the user's
182I<wanted()> function are still tainted. If this option is used while
7e47e6ff 183not in taint-mode, C<untaint> is a no-op.
81793b90
GS
184
185=item C<untaint_pattern>
186
187See above. This should be set using the C<qr> quoting operator.
3555aed3 188The default is set to C<qr|^([-+@\w./]+)$|>.
1cffc1dd 189Note that the parentheses are vital.
81793b90
GS
190
191=item C<untaint_skip>
192
3555aed3 193If set, a directory which fails the I<untaint_pattern> is skipped,
7e47e6ff 194including all its sub-directories. The default is to 'die' in such a case.
81793b90
GS
195
196=back
197
abfdd623
MS
198=head2 The wanted function
199
95e23d19
AL
200The C<wanted()> function does whatever verifications you want on
201each file and directory. Note that despite its name, the C<wanted()>
202function is a generic callback function, and does B<not> tell
203File::Find if a file is "wanted" or not. In fact, its return value
204is ignored.
205
206The wanted function takes no arguments but rather does its work
abfdd623
MS
207through a collection of variables.
208
209=over 4
210
f837ebe2 211=item C<$File::Find::dir> is the current directory name,
abfdd623
MS
212
213=item C<$_> is the current filename within that directory
214
f837ebe2 215=item C<$File::Find::name> is the complete pathname to the file.
abfdd623
MS
216
217=back
218
4c767a4e 219The above variables have all been localized and may be changed without
026810a8 220affecting data outside of the wanted function.
4c767a4e 221
95e23d19 222For example, when examining the file F</some/path/foo.ext> you will have:
abfdd623
MS
223
224 $File::Find::dir = /some/path/
225 $_ = foo.ext
226 $File::Find::name = /some/path/foo.ext
227
3555aed3 228You are chdir()'d to C<$File::Find::dir> when the function is called,
abfdd623
MS
229unless C<no_chdir> was specified. Note that when changing to
230directories is in effect the root directory (F</>) is a somewhat
231special case inasmuch as the concatenation of C<$File::Find::dir>,
232C<'/'> and C<$_> is not literally equal to C<$File::Find::name>. The
233table below summarizes all variants:
5cf0a2f2
WL
234
235 $File::Find::name $File::Find::dir $_
236 default / / .
237 no_chdir=>0 /etc / etc
238 /etc/x /etc x
abfdd623 239
5cf0a2f2
WL
240 no_chdir=>1 / / /
241 /etc / /etc
242 /etc/x /etc /etc/x
243
244
788c0c32 245When C<follow> or C<follow_fast> are in effect, there is
f10e1564
RM
246also a C<$File::Find::fullname>. The function may set
247C<$File::Find::prune> to prune the tree unless C<bydepth> was
248specified. Unless C<follow> or C<follow_fast> is specified, for
249compatibility reasons (find.pl, find2perl) there are in addition the
250following globals available: C<$File::Find::topdir>,
251C<$File::Find::topdev>, C<$File::Find::topino>,
e7b91b67 252C<$File::Find::topmode> and C<$File::Find::topnlink>.
47a735e8 253
20408e3c 254This library is useful for the C<find2perl> tool, which when fed,
f06db76b
AD
255
256 find2perl / -name .nfs\* -mtime +7 \
81793b90 257 -exec rm -f {} \; -o -fstype nfs -prune
f06db76b
AD
258
259produces something like:
260
261 sub wanted {
c7b9dd21 262 /^\.nfs.*\z/s &&
81793b90 263 (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
f06db76b
AD
264 int(-M _) > 7 &&
265 unlink($_)
266 ||
81793b90 267 ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
f06db76b 268 $dev < 0 &&
6280b799 269 ($File::Find::prune = 1);
f06db76b
AD
270 }
271
43dece2a
JH
272Notice the C<_> in the above C<int(-M _)>: the C<_> is a magical
273filehandle that caches the information from the preceding
95e23d19 274C<stat()>, C<lstat()>, or filetest.
43dece2a 275
1cffc1dd
JH
276Here's another interesting wanted function. It will find all symbolic
277links that don't resolve:
f06db76b
AD
278
279 sub wanted {
81793b90 280 -l && !-e && print "bogus link: $File::Find::name\n";
237437d0 281 }
f06db76b 282
81793b90
GS
283See also the script C<pfind> on CPAN for a nice application of this
284module.
285
cd68ec93
RGS
286=head1 WARNINGS
287
288If you run your program with the C<-w> switch, or if you use the
289C<warnings> pragma, File::Find will report warnings for several weird
290situations. You can disable these warnings by putting the statement
291
292 no warnings 'File::Find';
293
294in the appropriate scope. See L<perllexwarn> for more info about lexical
295warnings.
296
81793b90
GS
297=head1 CAVEAT
298
5fa2bf2b
DD
299=over 2
300
301=item $dont_use_nlink
302
303You can set the variable C<$File::Find::dont_use_nlink> to 1, if you want to
6cf3b067
T
304force File::Find to always stat directories. This was used for file systems
305that do not have an C<nlink> count matching the number of sub-directories.
306Examples are ISO-9660 (CD-ROM), AFS, HPFS (OS/2 file system), FAT (DOS file
307system) and a couple of others.
5fa2bf2b 308
6cf3b067
T
309You shouldn't need to set this variable, since File::Find should now detect
310such file systems on-the-fly and switch itself to using stat. This works even
311for parts of your file system, like a mounted CD-ROM.
5fa2bf2b 312
6cf3b067 313If you do set C<$File::Find::dont_use_nlink> to 1, you will notice slow-downs.
5fa2bf2b
DD
314
315=item symlinks
316
f10e1564 317Be aware that the option to follow symbolic links can be dangerous.
81793b90
GS
318Depending on the structure of the directory tree (including symbolic
319links to directories) you might traverse a given (physical) directory
3555aed3 320more than once (only if C<follow_fast> is in effect).
81793b90
GS
321Furthermore, deleting or changing files in a symbolically linked directory
322might cause very unpleasant surprises, since you delete or change files
323in an unknown directory.
0530a6c4 324
5fa2bf2b
DD
325=back
326
6eb87ff8
MJD
327=head1 BUGS AND CAVEATS
328
329Despite the name of the C<finddepth()> function, both C<find()> and
330C<finddepth()> perform a depth-first search of the directory
331hierarchy.
332
a85af077
A
333=head1 HISTORY
334
335File::Find used to produce incorrect results if called recursively.
336During the development of perl 5.8 this bug was fixed.
337The first fixed version of File::Find was 1.01.
338
9b33fb8e
DJ
339=head1 SEE ALSO
340
341find, find2perl.
342
f06db76b
AD
343=cut
344
b75c8c73
MS
345our @ISA = qw(Exporter);
346our @EXPORT = qw(find finddepth);
6280b799 347
a0d0e21e 348
81793b90
GS
349use strict;
350my $Is_VMS;
a0b245d5 351my $Is_Win32;
81793b90
GS
352
353require File::Basename;
7e47e6ff 354require File::Spec;
81793b90 355
9f826d6a
BM
356# Should ideally be my() not our() but local() currently
357# refuses to operate on lexicals
358
359our %SLnkSeen;
360our ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
719c805e 361 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 362 $pre_process, $post_process, $dangling_symlinks);
81793b90
GS
363
364sub contract_name {
365 my ($cdir,$fn) = @_;
366
7e47e6ff 367 return substr($cdir,0,rindex($cdir,'/')) if $fn eq $File::Find::current_dir;
81793b90
GS
368
369 $cdir = substr($cdir,0,rindex($cdir,'/')+1);
370
371 $fn =~ s|^\./||;
372
373 my $abs_name= $cdir . $fn;
374
375 if (substr($fn,0,3) eq '../') {
51393fc0 376 1 while $abs_name =~ s!/[^/]*/\.\./+!/!;
81793b90
GS
377 }
378
379 return $abs_name;
380}
381
81793b90
GS
382sub PathCombine($$) {
383 my ($Base,$Name) = @_;
384 my $AbsName;
385
862f843b
NC
386 if (substr($Name,0,1) eq '/') {
387 $AbsName= $Name;
81793b90
GS
388 }
389 else {
862f843b
NC
390 $AbsName= contract_name($Base,$Name);
391 }
81793b90 392
862f843b
NC
393 # (simple) check for recursion
394 my $newlen= length($AbsName);
395 if ($newlen <= length($Base)) {
396 if (($newlen == length($Base) || substr($Base,$newlen,1) eq '/')
397 && $AbsName eq substr($Base,0,$newlen))
398 {
399 return undef;
81793b90
GS
400 }
401 }
402 return $AbsName;
403}
404
405sub Follow_SymLink($) {
406 my ($AbsName) = @_;
407
408 my ($NewName,$DEV, $INO);
409 ($DEV, $INO)= lstat $AbsName;
410
411 while (-l _) {
412 if ($SLnkSeen{$DEV, $INO}++) {
413 if ($follow_skip < 2) {
414 die "$AbsName is encountered a second time";
a0d0e21e
LW
415 }
416 else {
81793b90 417 return undef;
a0d0e21e
LW
418 }
419 }
81793b90
GS
420 $NewName= PathCombine($AbsName, readlink($AbsName));
421 unless(defined $NewName) {
422 if ($follow_skip < 2) {
423 die "$AbsName is a recursive symbolic link";
424 }
425 else {
426 return undef;
a0d0e21e 427 }
81793b90
GS
428 }
429 else {
430 $AbsName= $NewName;
431 }
432 ($DEV, $INO) = lstat($AbsName);
433 return undef unless defined $DEV; # dangling symbolic link
434 }
435
cd68ec93 436 if ($full_check && defined $DEV && $SLnkSeen{$DEV, $INO}++) {
7e47e6ff 437 if ( ($follow_skip < 1) || ((-d _) && ($follow_skip < 2)) ) {
81793b90
GS
438 die "$AbsName encountered a second time";
439 }
440 else {
441 return undef;
442 }
443 }
444
445 return $AbsName;
446}
447
17f410f9 448our($dir, $name, $fullname, $prune);
81793b90
GS
449sub _find_dir_symlnk($$$);
450sub _find_dir($$$);
451
7e47e6ff
JH
452# check whether or not a scalar variable is tainted
453# (code straight from the Camel, 3rd ed., page 561)
454sub is_tainted_pp {
455 my $arg = shift;
456 my $nada = substr($arg, 0, 0); # zero-length
457 local $@;
458 eval { eval "# $nada" };
459 return length($@) != 0;
3555aed3 460}
7e47e6ff 461
81793b90
GS
462sub _find_opt {
463 my $wanted = shift;
464 die "invalid top directory" unless defined $_[0];
465
9f826d6a
BM
466 # This function must local()ize everything because callbacks may
467 # call find() or finddepth()
468
469 local %SLnkSeen;
470 local ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
471 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 472 $pre_process, $post_process, $dangling_symlinks);
4c621faf 473 local($dir, $name, $fullname, $prune);
bc125c03 474 local *_ = \my $a;
9f826d6a 475
a0c9c202 476 my $cwd = $wanted->{bydepth} ? Cwd::fastcwd() : Cwd::getcwd();
a1ccf0c4
JM
477 if ($Is_VMS) {
478 # VMS returns this by default in VMS format which just doesn't
479 # work for the rest of this module.
480 $cwd = VMS::Filespec::unixpath($cwd);
481
482 # Apparently this is not expected to have a trailing space.
483 # To attempt to make VMS/UNIX conversions mostly reversable,
484 # a trailing slash is needed. The run-time functions ignore the
485 # resulting double slash, but it causes the perl tests to fail.
486 $cwd =~ s#/\z##;
487
488 # This comes up in upper case now, but should be lower.
489 # In the future this could be exact case, no need to change.
490 }
80e52b73
JH
491 my $cwd_untainted = $cwd;
492 my $check_t_cwd = 1;
493 $wanted_callback = $wanted->{wanted};
494 $bydepth = $wanted->{bydepth};
495 $pre_process = $wanted->{preprocess};
496 $post_process = $wanted->{postprocess};
497 $no_chdir = $wanted->{no_chdir};
a0b245d5
AD
498 $full_check = $Is_Win32 ? 0 : $wanted->{follow};
499 $follow = $Is_Win32 ? 0 :
1bb17459 500 $full_check || $wanted->{follow_fast};
80e52b73
JH
501 $follow_skip = $wanted->{follow_skip};
502 $untaint = $wanted->{untaint};
503 $untaint_pat = $wanted->{untaint_pattern};
504 $untaint_skip = $wanted->{untaint_skip};
505 $dangling_symlinks = $wanted->{dangling_symlinks};
81793b90 506
1cffc1dd 507 # for compatibility reasons (find.pl, find2perl)
9f826d6a 508 local our ($topdir, $topdev, $topino, $topmode, $topnlink);
81793b90
GS
509
510 # a symbolic link to a directory doesn't increase the link count
511 $avoid_nlink = $follow || $File::Find::dont_use_nlink;
3555aed3 512
e7b91b67 513 my ($abs_dir, $Is_Dir);
81793b90
GS
514
515 Proc_Top_Item:
4c621faf 516 foreach my $TOP (@_) {
7e47e6ff
JH
517 my $top_item = $TOP;
518
3555aed3
SP
519 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
520
a0b245d5
AD
521 if ($Is_Win32) {
522 $top_item =~ s|[/\\]\z||
523 unless $top_item =~ m{^(?:\w:)?[/\\]$};
7e47e6ff
JH
524 }
525 else {
526 $top_item =~ s|/\z|| unless $top_item eq '/';
7e47e6ff
JH
527 }
528
529 $Is_Dir= 0;
530
531 if ($follow) {
532
862f843b
NC
533 if (substr($top_item,0,1) eq '/') {
534 $abs_dir = $top_item;
7e47e6ff 535 }
862f843b
NC
536 elsif ($top_item eq $File::Find::current_dir) {
537 $abs_dir = $cwd;
538 }
539 else { # care about any ../
540 $top_item =~ s/\.dir\z//i if $Is_VMS;
541 $abs_dir = contract_name("$cwd/",$top_item);
7e47e6ff
JH
542 }
543 $abs_dir= Follow_SymLink($abs_dir);
544 unless (defined $abs_dir) {
80e52b73
JH
545 if ($dangling_symlinks) {
546 if (ref $dangling_symlinks eq 'CODE') {
547 $dangling_symlinks->($top_item, $cwd);
548 } else {
cd68ec93 549 warnings::warnif "$top_item is a dangling symbolic link\n";
80e52b73
JH
550 }
551 }
81793b90 552 next Proc_Top_Item;
7e47e6ff
JH
553 }
554
555 if (-d _) {
a1ccf0c4 556 $top_item =~ s/\.dir\z//i if $Is_VMS;
81793b90
GS
557 _find_dir_symlnk($wanted, $abs_dir, $top_item);
558 $Is_Dir= 1;
7e47e6ff
JH
559 }
560 }
81793b90 561 else { # no follow
7e47e6ff
JH
562 $topdir = $top_item;
563 unless (defined $topnlink) {
cd68ec93 564 warnings::warnif "Can't stat $top_item: $!\n";
7e47e6ff
JH
565 next Proc_Top_Item;
566 }
567 if (-d _) {
544ff7a7 568 $top_item =~ s/\.dir\z//i if $Is_VMS;
e7b91b67 569 _find_dir($wanted, $top_item, $topnlink);
81793b90 570 $Is_Dir= 1;
7e47e6ff 571 }
237437d0 572 else {
81793b90 573 $abs_dir= $top_item;
7e47e6ff
JH
574 }
575 }
81793b90 576
7e47e6ff 577 unless ($Is_Dir) {
81793b90 578 unless (($_,$dir) = File::Basename::fileparse($abs_dir)) {
862f843b 579 ($dir,$_) = ('./', $top_item);
81793b90
GS
580 }
581
7e47e6ff
JH
582 $abs_dir = $dir;
583 if (( $untaint ) && (is_tainted($dir) )) {
584 ( $abs_dir ) = $dir =~ m|$untaint_pat|;
81793b90
GS
585 unless (defined $abs_dir) {
586 if ($untaint_skip == 0) {
7e47e6ff 587 die "directory $dir is still tainted";
81793b90
GS
588 }
589 else {
590 next Proc_Top_Item;
591 }
592 }
7e47e6ff 593 }
81793b90 594
7e47e6ff 595 unless ($no_chdir || chdir $abs_dir) {
cd68ec93 596 warnings::warnif "Couldn't chdir $abs_dir: $!\n";
7e47e6ff
JH
597 next Proc_Top_Item;
598 }
719911cc 599
7e47e6ff 600 $name = $abs_dir . $_; # $File::Find::name
3bb6d3e5 601 $_ = $name if $no_chdir;
719911cc 602
abfdd623 603 { $wanted_callback->() }; # protect against wild "next"
81793b90 604
7e47e6ff 605 }
81793b90 606
7e47e6ff
JH
607 unless ( $no_chdir ) {
608 if ( ($check_t_cwd) && (($untaint) && (is_tainted($cwd) )) ) {
609 ( $cwd_untainted ) = $cwd =~ m|$untaint_pat|;
610 unless (defined $cwd_untainted) {
611 die "insecure cwd in find(depth)";
612 }
613 $check_t_cwd = 0;
614 }
615 unless (chdir $cwd_untainted) {
616 die "Can't cd to $cwd: $!\n";
617 }
618 }
81793b90
GS
619 }
620}
621
622# API:
623# $wanted
624# $p_dir : "parent directory"
625# $nlink : what came back from the stat
626# preconditions:
627# chdir (if not no_chdir) to dir
628
629sub _find_dir($$$) {
630 my ($wanted, $p_dir, $nlink) = @_;
631 my ($CdLvl,$Level) = (0,0);
632 my @Stack;
633 my @filenames;
634 my ($subcount,$sub_nlink);
635 my $SE= [];
636 my $dir_name= $p_dir;
7e47e6ff 637 my $dir_pref;
39e79f6b 638 my $dir_rel = $File::Find::current_dir;
7e47e6ff 639 my $tainted = 0;
5fa2bf2b 640 my $no_nlink;
7e47e6ff 641
a0b245d5
AD
642 if ($Is_Win32) {
643 $dir_pref
644 = ($p_dir =~ m{^(?:\w:[/\\]?|[/\\])$} ? $p_dir : "$p_dir/" );
645 } elsif ($Is_VMS) {
a1ccf0c4
JM
646
647 # VMS is returning trailing .dir on directories
648 # and trailing . on files and symbolic links
649 # in UNIX syntax.
650 #
651
652 $p_dir =~ s/\.(dir)?$//i unless $p_dir eq '.';
653
1e9c9d75 654 $dir_pref = ($p_dir =~ m/[\]>]+$/ ? $p_dir : "$p_dir/" );
7e47e6ff
JH
655 }
656 else {
657 $dir_pref= ( $p_dir eq '/' ? '/' : "$p_dir/" );
7e47e6ff 658 }
81793b90
GS
659
660 local ($dir, $name, $prune, *DIR);
7e47e6ff
JH
661
662 unless ( $no_chdir || ($p_dir eq $File::Find::current_dir)) {
81793b90 663 my $udir = $p_dir;
7e47e6ff
JH
664 if (( $untaint ) && (is_tainted($p_dir) )) {
665 ( $udir ) = $p_dir =~ m|$untaint_pat|;
81793b90
GS
666 unless (defined $udir) {
667 if ($untaint_skip == 0) {
668 die "directory $p_dir is still tainted";
669 }
670 else {
671 return;
672 }
237437d0 673 }
a0d0e21e 674 }
8d8eebbf 675 unless (chdir ($Is_VMS && $udir !~ /[\/\[<]+/ ? "./$udir" : $udir)) {
cd68ec93 676 warnings::warnif "Can't cd to $udir: $!\n";
81793b90
GS
677 return;
678 }
679 }
7e47e6ff
JH
680
681 # push the starting directory
57e73c4b 682 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
81793b90
GS
683
684 while (defined $SE) {
685 unless ($bydepth) {
3555aed3
SP
686 $dir= $p_dir; # $File::Find::dir
687 $name= $dir_name; # $File::Find::name
7e47e6ff 688 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
81793b90 689 # prune may happen here
7e47e6ff 690 $prune= 0;
abfdd623 691 { $wanted_callback->() }; # protect against wild "next"
7e47e6ff 692 next if $prune;
81793b90 693 }
7e47e6ff 694
81793b90 695 # change to that directory
7e47e6ff 696 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
81793b90 697 my $udir= $dir_rel;
7e47e6ff
JH
698 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_rel) )) ) {
699 ( $udir ) = $dir_rel =~ m|$untaint_pat|;
81793b90
GS
700 unless (defined $udir) {
701 if ($untaint_skip == 0) {
862f843b 702 die "directory (" . ($p_dir ne '/' ? $p_dir : '') . "/) $dir_rel is still tainted";
7e47e6ff 703 } else { # $untaint_skip == 1
3555aed3 704 next;
81793b90
GS
705 }
706 }
707 }
8d8eebbf 708 unless (chdir ($Is_VMS && $udir !~ /[\/\[<]+/ ? "./$udir" : $udir)) {
862f843b
NC
709 warnings::warnif "Can't cd to (" .
710 ($p_dir ne '/' ? $p_dir : '') . "/) $udir: $!\n";
81793b90
GS
711 next;
712 }
713 $CdLvl++;
714 }
715
3555aed3 716 $dir= $dir_name; # $File::Find::dir
81793b90
GS
717
718 # Get the list of files in the current directory.
7e47e6ff 719 unless (opendir DIR, ($no_chdir ? $dir_name : $File::Find::current_dir)) {
cd68ec93 720 warnings::warnif "Can't opendir($dir_name): $!\n";
81793b90
GS
721 next;
722 }
723 @filenames = readdir DIR;
724 closedir(DIR);
abfdd623 725 @filenames = $pre_process->(@filenames) if $pre_process;
719c805e 726 push @Stack,[$CdLvl,$dir_name,"",-2] if $post_process;
81793b90 727
98dc9551 728 # default: use whatever was specified
5fa2bf2b
DD
729 # (if $nlink >= 2, and $avoid_nlink == 0, this will switch back)
730 $no_nlink = $avoid_nlink;
731 # if dir has wrong nlink count, force switch to slower stat method
732 $no_nlink = 1 if ($nlink < 2);
733
734 if ($nlink == 2 && !$no_nlink) {
81793b90
GS
735 # This dir has no subdirectories.
736 for my $FN (@filenames) {
a1ccf0c4
JM
737 if ($Is_VMS) {
738 # Big hammer here - Compensate for VMS trailing . and .dir
739 # No win situation until this is changed, but this
740 # will handle the majority of the cases with breaking the fewest
741
742 $FN =~ s/\.dir\z//i;
743 $FN =~ s#\.$## if ($FN ne '.');
744 }
7e47e6ff 745 next if $FN =~ $File::Find::skip_pattern;
81793b90 746
7e47e6ff
JH
747 $name = $dir_pref . $FN; # $File::Find::name
748 $_ = ($no_chdir ? $name : $FN); # $_
abfdd623 749 { $wanted_callback->() }; # protect against wild "next"
81793b90
GS
750 }
751
752 }
753 else {
754 # This dir has subdirectories.
755 $subcount = $nlink - 2;
756
7bd31527
JH
757 # HACK: insert directories at this position. so as to preserve
758 # the user pre-processed ordering of files.
759 # EG: directory traversal is in user sorted order, not at random.
760 my $stack_top = @Stack;
761
81793b90 762 for my $FN (@filenames) {
7e47e6ff 763 next if $FN =~ $File::Find::skip_pattern;
5fa2bf2b 764 if ($subcount > 0 || $no_nlink) {
81793b90
GS
765 # Seen all the subdirs?
766 # check for directoriness.
767 # stat is faster for a file in the current directory
07867069 768 $sub_nlink = (lstat ($no_chdir ? $dir_pref . $FN : $FN))[3];
81793b90
GS
769
770 if (-d _) {
771 --$subcount;
544ff7a7 772 $FN =~ s/\.dir\z//i if $Is_VMS;
7bd31527
JH
773 # HACK: replace push to preserve dir traversal order
774 #push @Stack,[$CdLvl,$dir_name,$FN,$sub_nlink];
775 splice @Stack, $stack_top, 0,
776 [$CdLvl,$dir_name,$FN,$sub_nlink];
81793b90
GS
777 }
778 else {
7e47e6ff
JH
779 $name = $dir_pref . $FN; # $File::Find::name
780 $_= ($no_chdir ? $name : $FN); # $_
abfdd623 781 { $wanted_callback->() }; # protect against wild "next"
81793b90
GS
782 }
783 }
07867069 784 else {
7e47e6ff
JH
785 $name = $dir_pref . $FN; # $File::Find::name
786 $_= ($no_chdir ? $name : $FN); # $_
abfdd623 787 { $wanted_callback->() }; # protect against wild "next"
81793b90
GS
788 }
789 }
790 }
17b275ff
RA
791 }
792 continue {
57e73c4b 793 while ( defined ($SE = pop @Stack) ) {
81793b90
GS
794 ($Level, $p_dir, $dir_rel, $nlink) = @$SE;
795 if ($CdLvl > $Level && !$no_chdir) {
7e47e6ff 796 my $tmp;
862f843b 797 if ($Is_VMS) {
d8101854
CB
798 $tmp = '[' . ('-' x ($CdLvl-$Level)) . ']';
799 }
7e47e6ff
JH
800 else {
801 $tmp = join('/',('..') x ($CdLvl-$Level));
802 }
d8101854 803 die "Can't cd to $tmp from $dir_name"
7e47e6ff 804 unless chdir ($tmp);
81793b90
GS
805 $CdLvl = $Level;
806 }
7e47e6ff 807
b242981a
AD
808 if ($Is_Win32) {
809 $dir_name = ($p_dir =~ m{^(?:\w:[/\\]?|[/\\])$}
810 ? "$p_dir$dir_rel" : "$p_dir/$dir_rel");
3555aed3
SP
811 $dir_pref = "$dir_name/";
812 }
1e9c9d75
CB
813 elsif ($^O eq 'VMS') {
814 if ($p_dir =~ m/[\]>]+$/) {
815 $dir_name = $p_dir;
816 $dir_name =~ s/([\]>]+)$/.$dir_rel$1/;
817 $dir_pref = $dir_name;
818 }
819 else {
820 $dir_name = "$p_dir/$dir_rel";
821 $dir_pref = "$dir_name/";
822 }
823 }
7e47e6ff
JH
824 else {
825 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
826 $dir_pref = "$dir_name/";
827 }
828
719c805e 829 if ( $nlink == -2 ) {
7e47e6ff 830 $name = $dir = $p_dir; # $File::Find::name / dir
39e79f6b 831 $_ = $File::Find::current_dir;
abfdd623 832 $post_process->(); # End-of-directory processing
7e47e6ff
JH
833 }
834 elsif ( $nlink < 0 ) { # must be finddepth, report dirname now
835 $name = $dir_name;
862f843b
NC
836 if ( substr($name,-2) eq '/.' ) {
837 substr($name, length($name) == 2 ? -1 : -2) = '';
7e47e6ff 838 }
862f843b
NC
839 $dir = $p_dir;
840 $_ = ($no_chdir ? $dir_name : $dir_rel );
841 if ( substr($_,-2) eq '/.' ) {
842 substr($_, length($_) == 2 ? -1 : -2) = '';
7e47e6ff 843 }
abfdd623 844 { $wanted_callback->() }; # protect against wild "next"
7e47e6ff
JH
845 }
846 else {
847 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
848 last;
849 }
81793b90 850 }
a0d0e21e
LW
851 }
852}
853
81793b90
GS
854
855# API:
856# $wanted
857# $dir_loc : absolute location of a dir
858# $p_dir : "parent directory"
859# preconditions:
860# chdir (if not no_chdir) to dir
861
862sub _find_dir_symlnk($$$) {
7e47e6ff 863 my ($wanted, $dir_loc, $p_dir) = @_; # $dir_loc is the absolute directory
81793b90
GS
864 my @Stack;
865 my @filenames;
866 my $new_loc;
7e47e6ff 867 my $updir_loc = $dir_loc; # untainted parent directory
81793b90
GS
868 my $SE = [];
869 my $dir_name = $p_dir;
7e47e6ff
JH
870 my $dir_pref;
871 my $loc_pref;
39e79f6b 872 my $dir_rel = $File::Find::current_dir;
7e47e6ff
JH
873 my $byd_flag; # flag for pending stack entry if $bydepth
874 my $tainted = 0;
875 my $ok = 1;
876
862f843b
NC
877 $dir_pref = ( $p_dir eq '/' ? '/' : "$p_dir/" );
878 $loc_pref = ( $dir_loc eq '/' ? '/' : "$dir_loc/" );
81793b90
GS
879
880 local ($dir, $name, $fullname, $prune, *DIR);
7e47e6ff
JH
881
882 unless ($no_chdir) {
883 # untaint the topdir
884 if (( $untaint ) && (is_tainted($dir_loc) )) {
885 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|; # parent dir, now untainted
886 # once untainted, $updir_loc is pushed on the stack (as parent directory);
3555aed3
SP
887 # hence, we don't need to untaint the parent directory every time we chdir
888 # to it later
7e47e6ff 889 unless (defined $updir_loc) {
81793b90
GS
890 if ($untaint_skip == 0) {
891 die "directory $dir_loc is still tainted";
892 }
893 else {
894 return;
895 }
896 }
897 }
7e47e6ff
JH
898 $ok = chdir($updir_loc) unless ($p_dir eq $File::Find::current_dir);
899 unless ($ok) {
cd68ec93 900 warnings::warnif "Can't cd to $updir_loc: $!\n";
81793b90
GS
901 return;
902 }
903 }
904
7e47e6ff
JH
905 push @Stack,[$dir_loc,$updir_loc,$p_dir,$dir_rel,-1] if $bydepth;
906
81793b90
GS
907 while (defined $SE) {
908
909 unless ($bydepth) {
7e47e6ff 910 # change (back) to parent directory (always untainted)
704ea872 911 unless ($no_chdir) {
7e47e6ff 912 unless (chdir $updir_loc) {
cd68ec93 913 warnings::warnif "Can't cd to $updir_loc: $!\n";
704ea872
GS
914 next;
915 }
916 }
7e47e6ff
JH
917 $dir= $p_dir; # $File::Find::dir
918 $name= $dir_name; # $File::Find::name
919 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
920 $fullname= $dir_loc; # $File::Find::fullname
81793b90 921 # prune may happen here
7e47e6ff 922 $prune= 0;
704ea872 923 lstat($_); # make sure file tests with '_' work
abfdd623 924 { $wanted_callback->() }; # protect against wild "next"
7e47e6ff 925 next if $prune;
81793b90
GS
926 }
927
928 # change to that directory
7e47e6ff
JH
929 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
930 $updir_loc = $dir_loc;
931 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_loc) )) ) {
3555aed3 932 # untaint $dir_loc, what will be pushed on the stack as (untainted) parent dir
7e47e6ff
JH
933 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|;
934 unless (defined $updir_loc) {
81793b90
GS
935 if ($untaint_skip == 0) {
936 die "directory $dir_loc is still tainted";
a0d0e21e 937 }
237437d0 938 else {
81793b90 939 next;
237437d0 940 }
a0d0e21e
LW
941 }
942 }
7e47e6ff 943 unless (chdir $updir_loc) {
cd68ec93 944 warnings::warnif "Can't cd to $updir_loc: $!\n";
81793b90
GS
945 next;
946 }
947 }
948
7e47e6ff 949 $dir = $dir_name; # $File::Find::dir
81793b90
GS
950
951 # Get the list of files in the current directory.
7e47e6ff 952 unless (opendir DIR, ($no_chdir ? $dir_loc : $File::Find::current_dir)) {
cd68ec93 953 warnings::warnif "Can't opendir($dir_loc): $!\n";
81793b90
GS
954 next;
955 }
956 @filenames = readdir DIR;
957 closedir(DIR);
958
959 for my $FN (@filenames) {
a1ccf0c4
JM
960 if ($Is_VMS) {
961 # Big hammer here - Compensate for VMS trailing . and .dir
962 # No win situation until this is changed, but this
963 # will handle the majority of the cases with breaking the fewest.
964
965 $FN =~ s/\.dir\z//i;
966 $FN =~ s#\.$## if ($FN ne '.');
967 }
7e47e6ff 968 next if $FN =~ $File::Find::skip_pattern;
81793b90
GS
969
970 # follow symbolic links / do an lstat
07867069 971 $new_loc = Follow_SymLink($loc_pref.$FN);
81793b90
GS
972
973 # ignore if invalid symlink
3555aed3 974 unless (defined $new_loc) {
fab43c1b 975 if (!defined -l _ && $dangling_symlinks) {
615a2b9b
SP
976 if (ref $dangling_symlinks eq 'CODE') {
977 $dangling_symlinks->($FN, $dir_pref);
978 } else {
979 warnings::warnif "$dir_pref$FN is a dangling symbolic link\n";
980 }
981 }
982
983 $fullname = undef;
984 $name = $dir_pref . $FN;
985 $_ = ($no_chdir ? $name : $FN);
986 { $wanted_callback->() };
987 next;
988 }
7e47e6ff 989
81793b90 990 if (-d _) {
a1ccf0c4
JM
991 if ($Is_VMS) {
992 $FN =~ s/\.dir\z//i;
993 $FN =~ s#\.$## if ($FN ne '.');
994 $new_loc =~ s/\.dir\z//i;
995 $new_loc =~ s#\.$## if ($new_loc ne '.');
996 }
7e47e6ff 997 push @Stack,[$new_loc,$updir_loc,$dir_name,$FN,1];
81793b90
GS
998 }
999 else {
3555aed3 1000 $fullname = $new_loc; # $File::Find::fullname
7e47e6ff
JH
1001 $name = $dir_pref . $FN; # $File::Find::name
1002 $_ = ($no_chdir ? $name : $FN); # $_
abfdd623 1003 { $wanted_callback->() }; # protect against wild "next"
81793b90
GS
1004 }
1005 }
1006
81793b90
GS
1007 }
1008 continue {
57e73c4b 1009 while (defined($SE = pop @Stack)) {
7e47e6ff 1010 ($dir_loc, $updir_loc, $p_dir, $dir_rel, $byd_flag) = @$SE;
862f843b
NC
1011 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
1012 $dir_pref = "$dir_name/";
1013 $loc_pref = "$dir_loc/";
7e47e6ff
JH
1014 if ( $byd_flag < 0 ) { # must be finddepth, report dirname now
1015 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
3555aed3 1016 unless (chdir $updir_loc) { # $updir_loc (parent dir) is always untainted
cd68ec93 1017 warnings::warnif "Can't cd to $updir_loc: $!\n";
7e47e6ff
JH
1018 next;
1019 }
1020 }
1021 $fullname = $dir_loc; # $File::Find::fullname
1022 $name = $dir_name; # $File::Find::name
862f843b
NC
1023 if ( substr($name,-2) eq '/.' ) {
1024 substr($name, length($name) == 2 ? -1 : -2) = ''; # $File::Find::name
7e47e6ff 1025 }
862f843b
NC
1026 $dir = $p_dir; # $File::Find::dir
1027 $_ = ($no_chdir ? $dir_name : $dir_rel); # $_
1028 if ( substr($_,-2) eq '/.' ) {
1029 substr($_, length($_) == 2 ? -1 : -2) = '';
7e47e6ff
JH
1030 }
1031
1032 lstat($_); # make sure file tests with '_' work
abfdd623 1033 { $wanted_callback->() }; # protect against wild "next"
7e47e6ff
JH
1034 }
1035 else {
1036 push @Stack,[$dir_loc, $updir_loc, $p_dir, $dir_rel,-1] if $bydepth;
1037 last;
1038 }
a0d0e21e
LW
1039 }
1040 }
1041}
1042
81793b90 1043
20408e3c 1044sub wrap_wanted {
81793b90
GS
1045 my $wanted = shift;
1046 if ( ref($wanted) eq 'HASH' ) {
4c90698d
RB
1047 unless( exists $wanted->{wanted} and ref( $wanted->{wanted} ) eq 'CODE' ) {
1048 die 'no &wanted subroutine given';
1049 }
81793b90
GS
1050 if ( $wanted->{follow} || $wanted->{follow_fast}) {
1051 $wanted->{follow_skip} = 1 unless defined $wanted->{follow_skip};
1052 }
1053 if ( $wanted->{untaint} ) {
3555aed3 1054 $wanted->{untaint_pattern} = $File::Find::untaint_pattern
81793b90
GS
1055 unless defined $wanted->{untaint_pattern};
1056 $wanted->{untaint_skip} = 0 unless defined $wanted->{untaint_skip};
1057 }
1058 return $wanted;
1059 }
4c90698d 1060 elsif( ref( $wanted ) eq 'CODE' ) {
81793b90
GS
1061 return { wanted => $wanted };
1062 }
4c90698d
RB
1063 else {
1064 die 'no &wanted subroutine given';
1065 }
a0d0e21e
LW
1066}
1067
20408e3c 1068sub find {
81793b90
GS
1069 my $wanted = shift;
1070 _find_opt(wrap_wanted($wanted), @_);
a0d0e21e
LW
1071}
1072
55d729e4 1073sub finddepth {
81793b90
GS
1074 my $wanted = wrap_wanted(shift);
1075 $wanted->{bydepth} = 1;
1076 _find_opt($wanted, @_);
20408e3c 1077}
6280b799 1078
7e47e6ff
JH
1079# default
1080$File::Find::skip_pattern = qr/^\.{1,2}\z/;
1081$File::Find::untaint_pattern = qr|^([-+@\w./]+)$|;
1082
6280b799 1083# These are hard-coded for now, but may move to hint files.
10eba763 1084if ($^O eq 'VMS') {
81793b90 1085 $Is_VMS = 1;
7e47e6ff
JH
1086 $File::Find::dont_use_nlink = 1;
1087}
b242981a
AD
1088elsif ($^O eq 'MSWin32') {
1089 $Is_Win32 = 1;
1090}
748a9306 1091
7e47e6ff
JH
1092# this _should_ work properly on all platforms
1093# where File::Find can be expected to work
1094$File::Find::current_dir = File::Spec->curdir || '.';
1095
81793b90 1096$File::Find::dont_use_nlink = 1
b242981a 1097 if $^O eq 'os2' || $^O eq 'dos' || $^O eq 'amigaos' || $Is_Win32 ||
0c52c6a9 1098 $^O eq 'interix' || $^O eq 'cygwin' || $^O eq 'epoc' || $^O eq 'qnx' ||
1119cb72 1099 $^O eq 'nto';
6280b799 1100
20408e3c
GS
1101# Set dont_use_nlink in your hint file if your system's stat doesn't
1102# report the number of links in a directory as an indication
1103# of the number of files.
1104# See, e.g. hints/machten.sh for MachTen 2.2.
81793b90
GS
1105unless ($File::Find::dont_use_nlink) {
1106 require Config;
1107 $File::Find::dont_use_nlink = 1 if ($Config::Config{'dont_use_nlink'});
20408e3c
GS
1108}
1109
3555aed3
SP
1110# We need a function that checks if a scalar is tainted. Either use the
1111# Scalar::Util module's tainted() function or our (slower) pure Perl
7e47e6ff
JH
1112# fallback is_tainted_pp()
1113{
1114 local $@;
1115 eval { require Scalar::Util };
1116 *is_tainted = $@ ? \&is_tainted_pp : \&Scalar::Util::tainted;
1117}
1118
a0d0e21e 11191;