This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
podcheck.t: Deal properly with -1 counts
[perl5.git] / t / porting / podcheck.t
CommitLineData
a67b1afa 1#!/usr/bin/perl -w
f7b649f0 2
a67b1afa 3use strict;
16384ac1
KW
4use warnings;
5use feature 'unicode_strings';
6
7use Carp;
8use Digest;
9use File::Find;
10use File::Spec;
11use Scalar::Util;
12use Text::Tabs;
13
14BEGIN {
15 require '../regen/regen_lib.pl';
16}
17
18sub DEBUG { 0 };
19
20=pod
21
22=head1 NAME
23
24podcheck.t - Look for possible problems in the Perl pods
25
26=head1 SYNOPSIS
27
28 cd t
29 ./perl -I../lib porting/podcheck.t [--show_all] [--cpan] [--counts]
30 [ FILE ...]
31 ./perl -I../lib porting/podcheck.t --regen
32
33=head1 DESCRIPTION
34
35podcheck.t is an extension of Pod::Checker. It looks for pod errors and
36potential errors in the files given as arguments, or if none specified, in all
37pods in the distribution workspace, except those in the cpan directory (unless
38C<--cpan> is specified). It does additional checking beyond that done by
39Pod::Checker, and keeps a database of known potential problems, and will
40fail a pod only if the number of such problems differs from that given in the
41database. It also suppresses the C<(section) deprecated> message from
42Pod::Checker, since specifying the man page section number is quite proper to do.
43
44The additional checks it makes are:
45
46=over
47
48=item Cross-pod link checking
49
50Pod::Checker verifies that links to an internal target in a pod are not
51broken. podcheck.t extends that (when called without FILE arguments) to
52external links. It does this by gathering up all the possible targets in the
53workspace, and cross-checking them. The database has a list of known targets
54outside the workspace, so podcheck.t will not raise a warning for
55using those. It also checks that a non-broken link points to just one target.
56(The destination pod could have two targets with the same name.)
57
58=item An internal link that isn't so specified
59
60If a link is broken, but there is an existing internal target of the same
61name, it is likely that the internal target was meant, and the C<"/"> is
62missing from the C<LE<lt>E<gt>> pod command.
63
64=item Verbatim paragraphs that wrap in an 80 column window
65
66It's annoying to have lines wrap when displaying pod documentation in a
67terminal window. This checks that all such lines fit, and for those that
68don't, it tells you how much needs to be cut in order to fit. However,
69if you're fixing these, keep in mind that some terminal/pager combinations
70require really a maximum of 79 or 78 columns to display properly.
71
72Often, the easiest thing to do to gain space for these is to lower the indent
73to just one space.
74
75=item Missing or duplicate NAME or missing NAME short description
76
77A pod can't be linked to unless it has a unique name.
78And a NAME should have a dash and short description after it.
79
80=item =encoding statement issues
81
82This indicates if an C<=encoding> statement should be present, or moved to the
83front of the pod.
84
85=item Items that perhaps should be links
86
87There are mentions of apparent files in the pods that perhaps should be links
88instead, using C<LE<lt>...E<gt>>
89
90=item Items that perhaps should be C<FE<lt>...E<gt>>
91
92What look like path names enclosed in C<CE<lt>...E<gt>> should perhaps have
93C<FE<lt>...E<gt>> mark-up instead.
94
95=back
96
97A number of issues raised by podcheck.t and by the base Pod::Checker are not
98really problems, but merely potential problems. After inspecting them and
99deciding that they aren't real problems, it is possible to shut up this program
100about them, unlike base Pod::Checker. To do this, call podcheck.t with the
101C<--regen> option to regenerate the database. This tells it that all existing
102issues are to not be mentioned again.
103
104This isn't fool-proof. The database merely keeps track of the number of these
105potential problems of each type for each pod. If a new problem of a given
106type is introduced into the pod, podcheck.t will spit out all of them. You
107then have to figure out which is the new one, and should it be changed or not.
108But doing it this way insulates the database from having to keep track of line
109numbers of problems, which may change, or the exact wording of each problem
110which might also change without affecting whether it is a problem or not.
111
112Also, if the count of potential problems of a given type for a pod decreases,
113the database must be regenerated so that it knows the new number. The program
114gives instructions when this happens.
115
116There is currently no check that modules listed as valid in the data base
117actually are. Thus any errors introduced there will remain there.
118
119=head1 OPTIONS
120
121=over
122
123=item --regen
124
125Regenerate the data base used by podcheck.t to include all the existing
126potential problems. Future runs of the program will not then flag any of
127these.
128
129=item --cpan
130
131Normally, all pods in the cpan directory are skipped, except to make sure that
132any blead-upstream links to such pods are valid.
133This option will cause cpan upstream pods to be checked.
134
135=item --show_all
136
137Normally, if the number of potential problems of a given type found for a
138pod matches the expected value in the database, they will not be displayed.
139This option forces the database to be ignored during the run, so all potential
140problems are displayed and will fail their respective pod test. Specifying
141any particular FILES to operate on automatically selects this option.
142
143=item --counts
144
145Instead of testing, this just dumps the counts of the occurrences of the
146various types of potential problems in the data base.
147
148=back
149
150=head1 FILES
151
152The database is stored in F<t/porting/known_pod_issues.dat>
153
154=head1 SEE ALSO
155
156L<Pod::Checker>
157
158=cut
159
160#####################################################
161# HOW IT WORKS (in general)
162#
163# If not called with specific files to check, the directory structure is
164# examined for files that have pods in them. Files that might not have to be
165# fully parsed (e.g. in cpan) are parsed enough at this time to find their
166# pod's NAME, and to get a checksum.
167#
168# Those kinds of files are sorted last, but otherwise the pods are parsed with
169# the package coded here, My::Pod::Checker, which is an extension to
170# Pod::Checker that adds some tests and suppresses others that aren't
171# appropriate. The latter module has no provision for capturing diagnostics,
172# so a package, Tie_Array_to_FH, is used to force them to be placed into an
173# array instead of printed.
174#
175# Parsing the files builds up a list of links. The files are gone through
176# again, doing cross-link checking and outputting all saved-up problems with
177# each pod.
178#
179# Sorting the files last that potentially don't need to be fully parsed allows
180# us to not parse them unless there is a link to an internal anchor in them
181# from something that we have already parsed. Keeping checksums allows us to
182# not parse copies of other pods.
183#
184#####################################################
185
186# 1 => Exclude low priority messages that aren't likely to be problems, and
187# has many false positives; higher numbers give more messages.
188my $Warnings_Level = 200;
189
190# To see if two pods with the same NAME are actually copies of the same pod,
191# which is not an error, it uses a checksum to save work.
192my $digest_type = "SHA-1";
193
194my $original_dir = File::Spec->rel2abs(File::Spec->curdir);
195my $data_dir = File::Spec->catdir($original_dir, 'porting');
196my $known_issues = File::Spec->catfile($data_dir, 'known_pod_issues.dat');
197my $copy_fh;
198
199my $MAX_LINE_LENGTH = 80; # 80 columns
200my $INDENT = 8; # default nroff indent
201
202# Our warning messages. Better not have [('"] in them, as those are used as
203# delimiters for variable parts of the messages by poderror.
204my $line_length = "Verbatim line length including indents exceeds $MAX_LINE_LENGTH by";
205my $broken_link = "Apparent broken link";
206my $broken_internal_link = "Apparent internal link is missing its forward slash";
207my $see_not_linked = "? Should you be using L<...> instead of";
208my $C_with_slash = "? Should you be using F<...> or maybe L<...> instead of";
209my $multiple_targets = "There is more than one target";
210my $duplicate_name = "Pod NAME already used";
211my $need_encoding = "Should have =encoding statement because have non-ASCII";
212my $encoding_first = "=encoding must be first command (if present)";
213my $no_name = "There is no NAME";
214my $missing_name_description = "The NAME should have a dash and short description after it";
215
216# objects, tests, etc can't be pods, so don't look for them.
217my $non_pods = qr/\.(?:[achot]|zip|gz|bz2|jar|tar|tgz|PL|so)$/;
218
219
220# Pod::Checker messages to suppress
221my @suppressed_messages = (
222 "(section) in", # Checker is wrong to flag this
223 "multiple occurrence of link target", # We catch independently the ones
224 # that are real problems.
225 "unescaped <>",
226);
227
228sub suppressed {
229 # Returns bool as to if input message is one that is to be suppressed
230
231 my $message = shift;
232 return grep { $message =~ /^\Q$_/i } @suppressed_messages;
233}
234
235{ # Closure to contain a simple subset of test.pl. This is to get rid of the
236 # unnecessary 'failed at' messages that would otherwise be output pointing
237 # to a particular line in this file.
a67b1afa 238
16384ac1
KW
239 my $current_test = 0;
240 my $planned;
241
242 sub plan {
243 my %plan = @_;
244 $planned = $plan{tests};
245 print "1..$planned\n";
246 return;
247 }
248
249 sub ok {
250 my $success = shift;
251 my $message = shift;
252
253 chomp $message;
254
255 $current_test++;
256 print "not " unless $success;
257 print "ok $current_test - $message\n";
258 return;
259 }
260
261 sub skip {
262 my $why = shift;
263 my $n = @_ ? shift : 1;
264 for (1..$n) {
265 $current_test++;
266 print "ok $current_test # skip $why\n";
267 }
268 no warnings 'exiting';
269 last SKIP;
270 }
271
272 sub note {
273 my $message = shift;
274
275 chomp $message;
276
277 print $message =~ s/^/# /mgr;
278 print "\n";
279 return;
280 }
281
282 END {
283 if ($planned && $planned != $current_test) {
284 print STDERR
285 "# Looks like you planned $planned tests but ran $current_test.\n";
286 }
287 }
288}
289
290
291# List of known potential problems by pod and type.
292my %known_problems;
293
294# Pods given by the keys contain an interior node that is referred to from
295# outside it.
296my %has_referred_to_node;
297
298my $show_counts = 0;
299my $regen = 0;
300my $show_all = 0;
301
302# Assume that are to skip anything in /cpan
303my $do_upstream_cpan = 0;
304
305while (@ARGV && substr($ARGV[0], 0, 1) eq '-') {
306 my $arg = shift @ARGV;
307
308 $arg =~ s/^--/-/; # Treat '--' the same as a single '-'
309 if ($arg eq '-regen') {
310 $regen = 1;
311 }
312 elsif ($arg eq '-cpan') {
313 $do_upstream_cpan = 1;
314 }
315 elsif ($arg eq '-show_all') {
316 $show_all = 1;
317 }
318 elsif ($arg eq '-counts') {
319 $show_counts = 1;
320 }
321 else {
322 die <<EOF;
323Unknown option '$arg'
324
325Usage: $0 [ --regen | --cpan | --show_all ] [ FILE ... ]\n"
326 --cpan -> Include files in the cpan subdirectory.
327 --regen -> Regenerate the data file for $0
328 --show_all -> Show all known potential problems
329 --counts -> Don't test, but give summary counts of the currently
330 existing database
331EOF
332 }
333}
334
335my @files = @ARGV;
336
337if (($regen + $show_all + $show_counts + $do_upstream_cpan) > 1) {
338 croak "--regen, --show_all, --cpan, and --counts are mutually exclusive";
339}
340
341my $has_input_files = @files;
342
343if ($has_input_files && ($regen || $show_counts || $do_upstream_cpan)) {
344 croak "--regen, --counts and --cpan can't be used since using specific files";
345}
346
347our %problems; # potential problems found in this run
348
349package My::Pod::Checker { # Extend Pod::Checker
a67b1afa
MM
350 use parent 'Pod::Checker';
351
16384ac1
KW
352 # Uses inside out hash to protect from typos
353 # For new fields, remember to add to destructor DESTROY()
354 my %indents; # Stack of indents from =over's in effect for
355 # current line
356 my %current_indent; # Current line's indent
357 my %filename; # The pod is store in this file
358 my %skip; # is SKIP set for this pod
359 my %in_NAME; # true if within NAME section
360 my %in_begin; # true if within =begin section
361 my %linkable_item; # Bool: if the latest =item is linkable. It isn't
362 # for bullet and number lists
363 my %linkable_nodes; # Pod::Checker adds all =items to its node list,
364 # but not all =items are linkable to
365 my %seen_encoding_cmd; # true if have =encoding earlier
366 my %command_count; # Number of commands seen
367 my %seen_pod_cmd; # true if have =pod earlier
368 my %warned_encoding; # true if already have warned about =encoding
369 # problems
370
371 sub DESTROY {
372 my $addr = Scalar::Util::refaddr $_[0];
373 delete $command_count{$addr};
374 delete $current_indent{$addr};
375 delete $filename{$addr};
376 delete $in_begin{$addr};
377 delete $indents{$addr};
378 delete $in_NAME{$addr};
379 delete $linkable_item{$addr};
380 delete $linkable_nodes{$addr};
381 delete $seen_encoding_cmd{$addr};
382 delete $seen_pod_cmd{$addr};
383 delete $skip{$addr};
384 delete $warned_encoding{$addr};
385 return;
386 }
387
388 sub new {
389 my $class = shift;
390 my $filename = shift;
391
392 my $self = $class->SUPER::new(-quiet => 1,
393 -warnings => $Warnings_Level);
394 my $addr = Scalar::Util::refaddr $self;
395 $command_count{$addr} = 0;
396 $current_indent{$addr} = 0;
397 $filename{$addr} = $filename;
398 $in_begin{$addr} = 0;
399 $in_NAME{$addr} = 0;
400 $linkable_item{$addr} = 0;
401 $seen_encoding_cmd{$addr} = 0;
402 $seen_pod_cmd{$addr} = 0;
403 $warned_encoding{$addr} = 0;
404 return $self;
405 }
406
407 # re's for messages that Pod::Checker outputs
408 my $location = qr/ \b (?:in|at|on|near) \s+ /xi;
409 my $optional_location = qr/ (?: $location )? /xi;
410 my $line_reference = qr/ [('"]? $optional_location line\
411 (?: \d+ | EOF | \Q???\E | - )
412 [)'"]? /xi;
413
414 sub poderror { # Called to register a potential problem
415
416 # This adds an extra field to the parent hash, 'parameter'. It is
417 # used to extract the variable parts of a message leaving just the
418 # constant skeleton. This in turn allows the message to be
419 # categorized better, so that it shows up as a single type in our
420 # database, with the specifics of each occurrence not being stored with
421 # it.
422
423 my $self = shift;
424 my $opts = shift;
425
426 my $addr = Scalar::Util::refaddr $self;
427 return if $skip{$addr};
428
429 # Input can be a string or hash. If a string, parse it to separate
430 # out the line number and convert to a hash for easier further
431 # processing
432 my $message;
433 if (ref $opts ne 'HASH') {
434 $message = join "", $opts, @_;
435 my $line_number;
436 if ($message =~ s/\s*($line_reference)//) {
437 ($line_number = $1) =~ s/\s*$optional_location//;
438 }
439 else {
440 $line_number = '???';
441 }
442 $opts = { -msg => $message, -line => $line_number };
443 } else {
444 $message = $opts->{'-msg'};
445
446 }
447
448 $message =~ s/^\d+\s+//;
449 return if main::suppressed($message);
450
451 $self->SUPER::poderror($opts, @_);
452
453 $opts->{parameter} = "" unless $opts->{parameter};
454
455 # The variable parts of the message tend to be enclosed in '...',
456 # "....", or (...). Extract them and put them in an extra field,
457 # 'parameter'. This is trickier because the matching delimiter to a
458 # '(' is its mirror, and not itself. Text::Balanced could be used
459 # instead.
460 while ($message =~ m/ \s* $optional_location ( [('"] )/xg) {
461 my $delimiter = $1;
462 my $start = $-[0];
463 $delimiter = ')' if $delimiter eq '(';
464
465 # If there is no ending delimiter, don't consider it to be a
466 # variable part. Most likely it is a contraction like "Don't"
467 last unless $message =~ m/\G .+? \Q$delimiter/xg;
468
469 my $length = $+[0] - $start;
470
471 # Get the part up through the closing delimiter
472 my $special = substr($message, $start, $length);
473 $special =~ s/^\s+//; # No leading whitespace
474
475 # And add that variable part to the parameter, while removing it
476 # from the message. This isn't a foolproof way of finding the
477 # variable part. For example '(s)' can occur in e.g.,
478 # 'paragraph(s)'
479 if ($special ne '(s)') {
480 substr($message, $start, $length) = "";
481 pos $message = $start;
482 $opts->{-msg} = $message;
483 $opts->{parameter} .= " " if $opts->{parameter};
484 $opts->{parameter} .= $special;
485 }
486 }
487
488 # Extract any additional line number given. This is often the
489 # beginning location of something whereas the main line number gives
490 # the ending one.
491 if ($message =~ /( $line_reference )/xi) {
492 my $line_ref = $1;
493 while ($message =~ s/\s*\Q$line_ref//) {
494 $opts->{-msg} = $message;
495 $opts->{parameter} .= " " if $opts->{parameter};
496 $opts->{parameter} .= $line_ref;
497 }
498 }
499
500 carp("Couldn't extract line number from $message") if $message =~ /line \d+/;
501 push @{$problems{$filename{$addr}}{$message}}, $opts;
502 #push @{$problems{$self->get_filename}{$message}}, $opts;
503 }
504
505 sub check_encoding { # Does it need an =encoding statement?
506 my ($self, $paragraph, $line_num, $pod_para) = @_;
507
508 # Do nothing if there is an =encoding in the file, or if the line
509 # doesn't require an =encoding, or have already warned.
510 my $addr = Scalar::Util::refaddr $self;
511 return if $seen_encoding_cmd{$addr}
512 || $warned_encoding{$addr}
513 || $paragraph !~ /\P{ASCII}/;
514
515 $warned_encoding{$addr} = 1;
516 my ($file, $line) = $pod_para->file_line;
517 $self->poderror({ -line => $line, -file => $file,
518 -msg => $need_encoding
519 });
520 return;
521 }
522
523 sub verbatim {
524 my ($self, $paragraph, $line_num, $pod_para) = @_;
525 $self->check_encoding($paragraph, $line_num, $pod_para);
526
527 $self->SUPER::verbatim($paragraph, $line_num, $pod_para);
528
529 # Pick up the name, since the parent class doesn't in verbatim
530 # NAMEs; so treat as non-verbatim. The parent class only allows one
531 # paragraph in a NAME section, so if there is an extra blank line, it
532 # will trigger a message, but such a blank line is harmless, so skip
533 # in that case.
534 if ($in_NAME{Scalar::Util::refaddr $self} && $paragraph =~ /\S/) {
535 $self->textblock($paragraph, $line_num, $pod_para);
536 }
537
538 my @lines = split /^/, $paragraph;
539 for my $i (0 .. @lines - 1) {
540 $lines[$i] =~ s/\s+$//;
541 my $indent = $self->get_current_indent;
542 my $exceeds = length(Text::Tabs::expand($lines[$i]))
543 + $indent - $MAX_LINE_LENGTH;
544 next unless $exceeds > 0;
545 my ($file, $line) = $pod_para->file_line;
546 $self->poderror({ -line => $line + $i, -file => $file,
547 -msg => $line_length,
548 parameter => "+$exceeds (including " . ($indent - $INDENT) . " from =over's)",
549 });
550 }
551 }
552
553 sub textblock {
554 my ($self, $paragraph, $line_num, $pod_para) = @_;
555 $self->check_encoding($paragraph, $line_num, $pod_para);
556
557 $self->SUPER::textblock($paragraph, $line_num, $pod_para);
558
559 my ($file, $line) = $pod_para->file_line;
560 my $addr = Scalar::Util::refaddr $self;
561 if ($in_NAME{$addr}) {
562 if (! $self->name) {
563 my $text = $self->interpolate($paragraph, $line_num);
564 if ($text =~ /^\s*(\S+?)\s*$/) {
565 $self->name($1);
566 $self->poderror({ -line => $line, -file => $file,
567 -msg => $missing_name_description,
568 parameter => $1});
569 }
570 }
571 }
572 $paragraph = join " ", split /^/, $paragraph;
573
574 # Matches something that looks like a file name, but is enclosed in
575 # C<...>
576 my $C_path_re = qr{ \b ( C<
577 # exclude regexes and 'OS/2'
578 (?! (?: (?: s | qr | m) / ) | OS/2 > )
579 \w+ (?: / \w+ )+ > (?: \. \w+ )? )
580 }x;
581
582 # If looks like a reference to other documentation by containing the
583 # word 'See' and then a likely pod directive, warn.
584
585 while ($paragraph =~ m{ \b See \s+ ( ( [^L] ) <
586 ( [^<]*? ) # The not-< excludes nested C<L<...
587 > ) }ixg) {
588 my $construct = $1;
589 my $type = $2;
590 my $interior = $3;
591 if ($interior !~ /$non_pods/
592 && $construct !~ /$C_path_re/g) {
593 $self->poderror({ -line => $line, -file => $file,
594 -msg => $see_not_linked,
595 parameter => $construct
596 });
597 }
598 }
599 while ($paragraph =~ m/$C_path_re/g) {
600 my $construct = $1;
601 $self->poderror({ -line => $line, -file => $file,
602 -msg => $C_with_slash,
603 parameter => $construct
604 });
605 }
606 return;
607 }
608
609 sub command {
610 my ($self, $cmd, $paragraph, $line_num, $pod_para) = @_;
611 my $addr = Scalar::Util::refaddr $self;
612 if ($cmd eq "pod") {
613 $seen_pod_cmd{$addr}++;
614 }
615 elsif ($cmd eq "encoding") {
616 my ($file, $line) = $pod_para->file_line;
617 $seen_encoding_cmd{$addr} = 1;
618 if ($command_count{$addr} != 1 && $seen_pod_cmd{$addr}) {
619 $self->poderror({ -line => $line, -file => $file,
620 -msg => $encoding_first
621 });
622 }
623 }
624 $self->check_encoding($paragraph, $line_num, $pod_para);
625
626 # Pod::Check treats all =items as linkable, but the bullet and
627 # numbered lists really aren't. So keep our own list. This has to be
628 # processed before SUPER is called so that the list is started before
629 # the rest of it gets parsed.
630 if ($cmd eq 'item') { # Not linkable if item begins with * or a digit
631 $linkable_item{$addr} = ($paragraph !~ / ^ \s*
632 (?: [*]
633 | \d+ \.? (?: \$ | \s+ )
634 )/x)
635 ? 1
636 : 0;
a67b1afa 637
16384ac1
KW
638 }
639 $self->SUPER::command($cmd, $paragraph, $line_num, $pod_para);
640
641 $command_count{$addr}++;
642
643 $in_NAME{$addr} = 0; # Will change to 1 below if necessary
644 $in_begin{$addr} = 0; # ibid
645 if ($cmd eq 'over') {
646 my $text = $self->interpolate($paragraph, $line_num);
647 my $indent = 4; # default
648 $indent = $1 if $text && $text =~ /^\s*(\d+)\s*$/;
649 push @{$indents{$addr}}, $indent;
650 $current_indent{$addr} += $indent;
651 }
652 elsif ($cmd eq 'back') {
653 if (@{$indents{$addr}}) {
654 $current_indent{$addr} -= pop @{$indents{$addr}};
655 }
656 else {
657 # =back without corresponding =over, but should have
658 # warned already
659 $current_indent{$addr} = 0;
660 }
661 }
662 elsif ($cmd =~ /^head/) {
663 if (! $in_begin{$addr}) {
664
665 # If a particular formatter, then this command doesn't really
666 # apply
667 $current_indent{$addr} = 0;
668 undef @{$indents{$addr}};
669 }
670
671 my $text = $self->interpolate($paragraph, $line_num);
672 $in_NAME{$addr} = 1 if $cmd eq 'head1'
673 && $text && $text =~ /^NAME\b/;
674 }
675 elsif ($cmd eq 'begin') {
676 $in_begin{$addr} = 1;
677 }
678
679 return;
680 }
681
682 sub hyperlink {
683 my $self = shift;
684
685 # If the hyperlink is to an interior node of another page, save it
686 # so that we can see if we need to parse normally skipped files.
687 $has_referred_to_node{$_[0][1]{'-page'}} = 1
688 if $_[0] && $_[0][1]{'-page'} && $_[0][1]{'-node'};
689 return $self->SUPER::hyperlink($_[0]);
690 }
691
692 sub node {
693 my $self = shift;
694 my $text = $_[0];
695 if($text) {
696 $text =~ s/\s+$//s; # strip trailing whitespace
697 $text =~ s/\s+/ /gs; # collapse whitespace
698 my $addr = Scalar::Util::refaddr $self;
699 push(@{$linkable_nodes{$addr}}, $text) if
700 ! $current_indent{$addr}
701 || $linkable_item{$addr};
702 }
703 return $self->SUPER::node($_[0]);
704 }
705
706 sub get_current_indent {
707 return $INDENT + $current_indent{Scalar::Util::refaddr $_[0]};
708 }
709
710 sub get_filename {
711 return $filename{Scalar::Util::refaddr $_[0]};
712 }
713
714 sub linkable_nodes {
715 my $linkables = $linkable_nodes{Scalar::Util::refaddr $_[0]};
716 return undef unless $linkables;
717 return @$linkables;
718 }
719
720 sub get_skip {
721 return $skip{Scalar::Util::refaddr $_[0]} // 0;
722 }
723
724 sub set_skip {
725 my $self = shift;
726 $skip{Scalar::Util::refaddr $self} = shift;
727
728 # If skipping, no need to keep the problems for it
729 delete $problems{$self->get_filename};
730 return;
731 }
732}
733
734package Tie_Array_to_FH { # So printing actually goes to an array
735
736 my %array;
737
738 sub TIEHANDLE {
739 my $class = shift;
740 my $array_ref = shift;
741
742 my $self = bless \do{ my $anonymous_scalar }, $class;
743 $array{Scalar::Util::refaddr $self} = $array_ref;
744
745 return $self;
746 }
747
748 sub PRINT {
a67b1afa 749 my $self = shift;
16384ac1
KW
750 push @{$array{Scalar::Util::refaddr $self}}, @_;
751 return 1;
752 }
a67b1afa
MM
753}
754
69f6a9a1 755
16384ac1
KW
756my %filename_to_checker; # Map a filename to it's pod checker object
757my %id_to_checker; # Map a checksum to it's pod checker object
758my %nodes; # key is filename, values are nodes in that file.
759my %nodes_first_word; # same, but value is first word of each node
760my %valid_modules; # List of modules known to exist outside us.
761my %digests; # checksums of files, whose names are the keys
762my %filename_to_pod; # Map a filename to its pod NAME
763my %files_with_unknown_issues;
764my %files_with_fixes;
69f6a9a1 765
16384ac1
KW
766my $data_fh;
767open($data_fh, $known_issues) || die "Can't open $known_issues";
69f6a9a1 768
16384ac1
KW
769my %counts; # For --counts param, count of each issue type
770my %suppressed_files; # Files with at least one issue type to suppress
771
772while (<$data_fh>) { # Read the data base
69f6a9a1 773 chomp;
16384ac1
KW
774 next if /^\s*(?:#|$)/; # Skip comment and empty lines
775 if (/\t/) {
776 next if $show_all;
777
778 # Keep track of counts of each issue type for each file
779 my ($filename, $message, $count) = split /\t/;
780 $known_problems{$filename}{$message} = $count;
781
782 if ($show_counts) {
783 if ($count < 0) { # -1 means to suppress this issue type
784 $suppressed_files{$filename} = $filename;
785 }
786 else {
787 $counts{$message} += $count;
788 }
789 }
790 }
791 else { # Lines without a tab are modules known to be valid
792 $valid_modules{$_} = 1
793 }
794}
795close $data_fh;
796
797if ($show_counts) {
798 my $total = 0;
799 foreach my $message (sort keys %counts) {
800 $total += $counts{$message};
801 note(Text::Tabs::expand("$counts{$message}\t$message"));
802 }
803 note("-----\n" . Text::Tabs::expand("$total\tknown potential issues"));
804 if (%suppressed_files) {
805 note("\nFiles that have all messages of at least one type suppressed:");
806 note(join ",", keys %suppressed_files);
807 }
808 exit 0;
809}
810
811
812my %excluded_files = (
813 "lib/unicore/mktables" => 1,
814 "Porting/perldelta_template.pod" => 1,
815 "autodoc.pl" => 1,
816 "configpm" => 1,
817 "miniperl" => 1,
818 "perl" => 1,
819 );
820
821# re to match files that are to be parsed only if there is an internal link
822# to them. It does not include cpan, as whether those are parsed depends
823# on a switch. Currently, only the stable perldelta.pod's are included.
824# These all have characters between 'perl' and 'delta'. (Actually the
825# currently developed one matches as well, but is a duplicate of
826# perldelta.pod, so can be skipped, so fine for it to match this.
827my $only_for_interior_links_re = qr/ \b perl \d+ delta \. pod \b/x;
828
829{ # Closure
830 my $first_time = 1;
831
832 sub output_thanks ($$$$) { # Called when an issue has been fixed
833 my $filename = shift;
834 my $original_count = shift;
835 my $current_count = shift;
836 my $message = shift;
837
838 $files_with_fixes{$filename} = 1;
839 my $return;
840 my $fixed_count = $original_count - $current_count;
841 my $a_problem = ($fixed_count == 1) ? "a problem" : "multiple problems";
842 my $another_problem = ($fixed_count == 1) ? "another problem" : "another set of problems";
843 my $diff;
844 if ($message) {
845 $diff = <<EOF;
846There were $original_count occurrences (now $current_count) in this pod of type
847"$message",
848EOF
849 } else {
850 $diff = <<EOF;
851There are no longer any problems found in this pod!
852EOF
853 }
854
855 if ($first_time) {
856 $first_time = 0;
857 $return = <<EOF;
858Thanks for fixing $a_problem!
859$diff
860Now you must teach $0 that this was fixed.
861EOF
862 }
863 else {
864 $return = <<EOF
865Thanks for fixing $another_problem.
866$diff
867EOF
868 }
869
870 return $return;
871 }
872}
873
874sub my_safer_print { # print, with error checking for outputting to db
875 my ($fh, @lines) = @_;
876
877 if (! print $fh @lines) {
878 my $save_error = $!;
879 close($fh);
880 die "Write failure: $save_error";
881 }
882}
883
884sub extract_pod { # Extracts just the pod from a file
885 my $filename = shift;
886
887 my @pod;
888
889 # Arrange for the output of Pod::Parser to be collected in an array we can
890 # look at instead of being printed
891 tie *ALREADY_FH, 'Tie_Array_to_FH', \@pod;
892 open my $in_fh, '<', $filename
893 or die "Can't open '$filename': $!\n";
894
895 my $parser = Pod::Parser->new();
896 $parser->parse_from_filehandle($in_fh, *ALREADY_FH);
897 close $in_fh;
898
899 return join "", @pod
900}
901
902my $digest = Digest->new($digest_type);
903
904sub is_pod_file {
905 if (-d $_) {
906 # Don't look at files in directories that are for tests, nor those
907 # beginning with a dot
908 if ($_ eq 't' || $_ =~ /^\../) {
909 $File::Find::prune = 1;
910 }
911 return;
912 }
913
914 return if $_ =~ /^\./; # No hidden Unix files
915 return if $_ =~ $non_pods;
916
917 my $filename = $File::Find::name;
918
919 # Assumes that the path separator is exactly one character.
920 $filename =~ s/^\..//;
921
922 return if $excluded_files{$filename};
923
924 open my $candidate, '<', $_
925 or die "Can't open '$File::Find::name': $!\n";
926 my @contents = <$candidate>;
927 close $candidate;
928
929 # If the file is a .pm or .pod, having any initial '=' on a line is
930 # grounds for testing it. Otherwise, require a head1 NAME line to view it
931 # as a potential pod
932 my $i;
933 my $found = "";
934 for ($i = 0; $i < @contents; $i++) {
935 next unless $contents[$i] =~ /^=/;
936 if ($filename =~ /\.(?:pm|pod)/) {
937 $found = 'found_some_pod_line';
938 last;
939 }
940 elsif ($contents[$i] =~ /^=head1 +NAME/) {
941 $found = 'found_NAME';
942 last;
943 }
944 }
945 if ($found) {
946 # Here, we know that the file is a pod. Add it to the list of files
947 # to check and create a checker object for it.
948
949 push @files, $filename;
950 my $checker = My::Pod::Checker->new($filename);
951 $filename_to_checker{$filename} = $checker;
952
953 # In order to detect duplicate pods and only analyze them once, we
954 # compute checksums for the file, so don't have to do an exact
955 # compare. Note that if the pod is just part of the file, the
956 # checksums can differ for the same pod. That special case is handled
957 # later, since if the checksums of the whole file are the same, that
958 # case won't even come up. We don't need the checksums for files that
959 # we parse only if there is a link to its interior, but we do need its
960 # NAME, which is also retrieved in the code below.
961 if ($filename =~ / (?: ^(cpan|lib|ext|dist)\/ )
962 | $only_for_interior_links_re
963 /x) {
964 $digest->add(@contents);
965 $digests{$filename} = $digest->digest;
966
967 # lib files aren't analyzed if they are duplicates of files copied
968 # there from some other directory. But to determine this, we need
969 # to know their NAMEs. We might as well find the NAME now while
970 # the file is open. Similarly, cpan files aren't analyzed unless
971 # we're analyzing all of them, or this particular file is linked
972 # to by a file we are analyzing, and thus we will want to verify
973 # that the target exists in it. We need to know at least the NAME
974 # to see if it's worth analyzing, or so we can determine if a lib
975 # file is a copy of a cpan one.
976 if ($filename =~ m{ (?: ^ (?: cpan | lib ) / )
977 | $only_for_interior_links_re
978 }x) {
979 if ($found eq 'found_some_pod_line') {
980 for (; $i < @contents; $i++) {
981 next if $contents[$i] !~ /^=head1/;
982 $found = 'found_NAME'
983 if $contents[$i] =~ /^=head1 +NAME/;
984 last;
985 }
986 }
987 if ($found eq 'found_NAME') {
988 $i++; # The NAME starts on a later line
989
990 # Skip empty lines
991 while ($contents[$i] !~ /\S/) { $i++ }
992
993 # The NAME is the first non-spaces on the line up to a
994 # comma, dash or end of line. Otherwise, it's invalid and
995 # this pod doesn't have a legal name that we're smart
996 # enough to find currently. But the parser will later
997 # find it if it thinks there is a legal name, and set the
998 # name
999 if ($contents[$i] =~ /^ \s* ( \S+?) \s* (?: [,-] | $ )/x) {
1000 my $name = $1;
1001 $checker->name($name);
1002 $id_to_checker{$name} = $checker
1003 if $filename =~ m{^cpan/};
1004 }
1005 }
1006 elsif ($filename =~ m{^cpan/}) {
1007 $id_to_checker{$digests{$filename}} = $checker;
1008 }
1009 }
1010 }
1011 }
1012} # End of is_pod_file()
1013
1014# Start of real code that isn't processing the command line.
1015# Here, @files contains list of files on the command line. If have any of
1016# these, unconditionally test them, and show all the errors, even the known
1017# ones, and, since not testing other pods, don't do cross-pod link tests.
1018# (Could add extra code to do cross-pod tests for the ones in the list.)
1019if ($has_input_files) {
1020 undef %known_problems;
1021 $do_upstream_cpan = 1; # In case one of the inputs is from cpan
1022
1023}
1024else { # No input files -- go find all the possibilities.
1025 if ($regen) {
1026 $copy_fh = open_new($known_issues);
1027 note("Regenerating $known_issues, please be patient...");
1028 print $copy_fh <<END;
1029# This file is the data file for $0.
1030# There are three types of lines.
1031# Comment lines are white-space only or begin with a '#', like this one. Any
1032# changes you make to the comment lines will be lost when the file is
1033# regen'd.
1034# Lines without tab characters are simply NAMES of pods that the program knows
1035# will have links to them and the program does not check if those links are
1036# valid.
1037# All other lines should have three fields, each separated by a tab. The
1038# first field is the name of a pod; the second field is an error message
1039# generated by this program; and the third field is a count of how many
1040# known instances of that message there are in the pod. -1 means that the
1041# program can expect any number of this type of message.
1042END
1043 }
1044
1045 # Move to the directory above us, but have to adjust @INC to account for
1046 # that.
1047 s{^\.\./lib$}{lib} for @INC;
1048 chdir File::Spec->updir;
1049
1050 # And look in this directory and all its subdirectories
1051 find( \&is_pod_file, '.');
1052
1053 # Add ourselves to the test
1054 push @files, 't/porting/podcheck.t';
1055}
1056
1057# Now we know how many tests there will be.
1058plan (tests => scalar @files) if ! $regen;
1059
1060
1061 # Sort file names so we get consistent results, and to put cpan last,
1062 # preceeded by the ones that we don't generally parse. This is because both
1063 # these classes are generally parsed only if there is a link to the interior
1064 # of them, and we have to parse all others first to guarantee that they don't
1065 # have such a link. 'lib' files come just before these, as some of these are
1066 # duplicates of others. We already have figured this out when gathering the
1067 # data as a special case for all such files, but this, while unnecessary,
1068 # puts the derived file last in the output. 'readme' files come before those,
1069 # as those also could be duplicates of others, which are considered the
1070 # primary ones. These currently aren't figured out when gathering data, so
1071 # are done here.
1072 @files = sort { if ($a =~ /^cpan/) {
1073 return 1 if $b !~ /^cpan/;
1074 return $a cmp $b;
1075 }
1076 elsif ($b =~ /^cpan/) {
1077 return -1;
1078 }
1079 elsif ($a =~ /$only_for_interior_links_re/) {
1080 return 1 if $b !~ /$only_for_interior_links_re/;
1081 return $a cmp $b;
1082 }
1083 elsif ($b =~ /$only_for_interior_links_re/) {
1084 return -1;
1085 }
1086 elsif ($a =~ /^lib/) {
1087 return 1 if $b !~ /^lib/;
1088 return $a cmp $b;
1089 }
1090 elsif ($b =~ /^lib/) {
1091 return -1;
1092 } elsif ($a =~ /\breadme\b/i) {
1093 return 1 if $b !~ /\breadme\b/i;
1094 return $a cmp $b;
1095 }
1096 elsif ($b =~ /\breadme\b/i) {
1097 return -1;
1098 }
1099 else {
1100 return lc $a cmp lc $b;
1101 }
1102 }
1103 @files;
1104
1105# Now go through all the files and parse them
1106foreach my $filename (@files) {
1107 my $parsed = 0;
1108 note("parsing $filename") if DEBUG;
1109
1110 # We may have already figured out some things in the process of generating
1111 # the file list. If so, have a $checker object already. But if not,
1112 # generate one now.
1113 my $checker = $filename_to_checker{$filename};
1114 if (! $checker) {
1115 $checker = My::Pod::Checker->new($filename);
1116 $filename_to_checker{$filename} = $checker;
1117 }
1118
1119 # We have set the name in the checker object if there is a possibility
1120 # that no further parsing is necessary, but otherwise do the parsing now.
1121 if (! $checker->name) {
1122 $parsed = 1;
1123 $checker->parse_from_file($filename, undef);
1124 }
1125
1126 if ($checker->num_errors() < 0) { # Returns negative if not a pod
1127 $checker->set_skip("$filename is not a pod");
1128 }
1129 else {
1130
1131 # Here, is a pod. See if it is one that has already been tested,
1132 # or should be tested under another directory. Use either its NAME
1133 # if it has one, or a checksum if not.
1134 my $name = $checker->name;
1135 my $id;
1136
1137 if ($name) {
1138 $id = $name;
1139 }
1140 else {
1141 my $digest = Digest->new($digest_type);
1142 $digest->add(extract_pod($filename));
1143 $id = $digest->digest;
1144 }
1145
1146 # If there is a match for this pod with something that we've already
1147 # processed, don't process it, and output why.
1148 my $prior_checker;
1149 if (defined ($prior_checker = $id_to_checker{$id})
1150 && $prior_checker != $checker) # Could have defined the checker
1151 # earlier without pursuing it
1152 {
1153
1154 # If the pods are identical, then it's just a copy, and isn't an
1155 # error. First use the checksums we have already computed to see
1156 # if the entire files are identical, which means that the pods are
1157 # identical too.
1158 my $prior_filename = $prior_checker->get_filename;
1159 my $same = (! $name
1160 || ($digests{$prior_filename}
1161 && $digests{$filename}
1162 && $digests{$prior_filename} eq $digests{$filename}));
1163
1164 # If they differ, it could be that the files differ for some
1165 # reason, but the pods they contain are identical. Extract the
1166 # pods and do the comparisons on just those.
1167 if (! $same && $name) {
1168 $same = extract_pod($prior_filename) eq extract_pod($filename);
1169 }
1170
1171 if ($same) {
1172 $checker->set_skip("The pod of $filename is a duplicate of "
1173 . "the pod for $prior_filename");
1174 } elsif ($prior_filename =~ /\breadme\b/i) {
1175 $checker->set_skip("$prior_filename is a README apparently for $filename");
1176 } elsif ($filename =~ /\breadme\b/i) {
1177 $checker->set_skip("$filename is a README apparently for $prior_filename");
1178 } else { # Here have two pods with identical names that differ
1179 $prior_checker->poderror(
1180 { -msg => $duplicate_name,
1181 -line => "???",
1182 parameter => "'$filename' also has NAME '$name'"
1183 });
1184 $checker->poderror(
1185 { -msg => $duplicate_name,
1186 -line => "???",
1187 parameter => "'$prior_filename' also has NAME '$name'"
1188 });
1189
1190 # Changing the names helps later.
1191 $prior_checker->name("$name version arbitrarily numbered 1");
1192 $checker->name("$name version arbitrarily numbered 2");
1193 }
1194
1195 # In any event, don't process this pod that has the same name as
1196 # another.
1197 next;
1198 }
1199
1200 # A unique pod.
1201 $id_to_checker{$id} = $checker;
1202
1203 my $parsed_for_links = ", but parsed for its interior links";
1204 if ((! $do_upstream_cpan && $filename =~ /^cpan/)
1205 || $filename =~ $only_for_interior_links_re)
1206 {
1207 if ($filename =~ /^cpan/) {
1208 $checker->set_skip("CPAN is upstream for $filename");
1209 }
1210 elsif ($filename =~ /perl\d+delta/) {
1211 $checker->set_skip("$filename is a stable perldelta");
1212 }
1213 else {
1214 croak("Unexpected file '$filename' encountered that has parsing for interior-linking only");
1215 }
1216
1217 if ($name && $has_referred_to_node{$name}) {
1218 $checker->set_skip($checker->get_skip() . $parsed_for_links);
1219 }
1220 }
1221
1222 # Need a name in order to process it, because not meaningful
1223 # otherwise, and also can't test links to this without a name.
1224 if (!defined $name) {
1225 $checker->poderror( { -msg => $no_name,
1226 -line => '???'
1227 });
1228 next;
1229 }
1230
1231 # For skipped files, just get its NAME
1232 my $skip;
1233 if (($skip = $checker->get_skip()) && $skip !~ /$parsed_for_links/)
1234 {
1235 $checker->node($name) if $name;
1236 }
1237 else {
1238 $checker->parse_from_file($filename, undef) if ! $parsed;
1239 }
1240
1241 # Go through everything in the file that could be an anchor that
1242 # could be a link target. Count how many there are of the same name.
1243 foreach my $node ($checker->linkable_nodes) {
1244 next if ! $node; # Can be empty is like '=item *'
1245 if (exists $nodes{$name}{$node}) {
1246 $nodes{$name}{$node}++;
1247 }
1248 else {
1249 $nodes{$name}{$node} = 1;
1250 }
1251
1252 # Experiments have shown that cpan search can figure out the
1253 # target of a link even if the exact wording is incorrect, as long
1254 # as the first word is. This happens frequently in perlfunc.pod,
1255 # where the link will be just to the function, but the target
1256 # entry also includes parameters to the function.
1257 my $first_word = $node;
1258 if ($first_word =~ s/^(\S+)\s+\S.*/$1/) {
1259 $nodes_first_word{$name}{$first_word} = $node;
1260 }
1261 }
1262 $filename_to_pod{$filename} = $name;
1263 }
1264}
1265
1266# Here, all files have been parsed, and all links and link targets are stored.
1267# Now go through the files again and see which don't have matches.
1268if (! $has_input_files) {
1269 foreach my $filename (@files) {
1270 next if $filename_to_checker{$filename}->get_skip;
1271 my $checker = $filename_to_checker{$filename};
1272 foreach my $link ($checker->hyperlink) {
1273 my $linked_to_page = $link->[1]->page;
1274 next unless $linked_to_page; # intra-file checks are handled by std
1275 # Pod::Checker
1276
1277 # Initialize the potential message.
1278 my %problem = ( -msg => $broken_link,
1279 -line => $link->[0],
1280 parameter => "to \"$linked_to_page\"",
1281 );
1282
1283 # See if we have found the linked-to_file in our parse
1284 if (exists $nodes{$linked_to_page}) {
1285 my $node = $link->[1]->node;
1286
1287 # If link is only to the page-level, already have it
1288 next if ! $node;
1289
1290 # Transform pod language to what we are expecting
1291 $node =~ s,E<sol>,/,g;
1292 $node =~ s/E<verbar>/|/g;
1293
1294 # If link is to a node that exists in the file, is ok
1295 if ($nodes{$linked_to_page}{$node}) {
1296
1297 # But if the page has multiple targets with the same name,
1298 # it's ambiguous which one this should be to.
1299 if ($nodes{$linked_to_page}{$node} > 1) {
1300 $problem{-msg} = $multiple_targets;
1301 $problem{parameter} = "in $linked_to_page that $node could be pointing to";
1302 $checker->poderror(\%problem);
1303 }
1304 } elsif (! $nodes_first_word{$linked_to_page}{$node}) {
1305
1306 # Here the link target was not found, either exactly or to
1307 # the first word. Is an error.
1308 $problem{parameter} =~ s,"$,/$node",;
1309 $checker->poderror(\%problem);
1310 }
1311
1312 } # Linked-to-file not in parse; maybe is in exception list
1313 elsif (! exists $valid_modules{$link->[1]->page}) {
1314
1315 # Here, is a link to a target that we can't find. Check if
1316 # there is an internal link on the page with the target name.
1317 # If so, it could be that they just forgot the initial '/'
1318 if ($filename_to_pod{$filename}
1319 && $nodes{$filename_to_pod{$filename}}{$linked_to_page})
1320 {
1321 $problem{-msg} = $broken_internal_link;
1322 }
1323 $checker->poderror(\%problem);
1324 }
1325 }
1326 }
1327}
1328
1329# If regenerating the data file, start with the modules for which we don't
1330# check targets
1331if ($regen) {
1332 foreach (sort { lc $a cmp lc $b } keys %valid_modules) {
1333 my_safer_print($copy_fh, $_, "\n");
1334 }
1335}
1336
1337# Now ready to output the messages.
1338foreach my $filename (@files) {
1339 my $test_name = "POD of $filename";
1340 SKIP: {
1341 my $skip = $filename_to_checker{$filename}->get_skip // "";
1342
1343 if ($regen) {
1344 foreach my $message ( sort keys %{$problems{$filename}}) {
1345 my $count;
1346
1347 # Preserve a negative setting.
1348 if ($known_problems{$filename}{$message}
1349 && $known_problems{$filename}{$message} < 0)
1350 {
1351 $count = $known_problems{$filename}{$message};
1352 }
1353 else {
1354 $count = @{$problems{$filename}{$message}};
1355 }
1356 my_safer_print($copy_fh, "$filename\t$message\t$count\n");
1357 }
1358 next;
1359 }
1360
1361 skip($skip, 1) if $skip;
1362 my @diagnostics;
1363 my $indent = ' ';
1364
16384ac1
KW
1365 my $total_known = 0;
1366 foreach my $message ( sort keys %{$problems{$filename}}) {
1367 $known_problems{$filename}{$message} = 0
1368 if ! $known_problems{$filename}{$message};
1369 my $diagnostic = "";
1370 my $problem_count = scalar @{$problems{$filename}{$message}};
1371 $total_known += $problem_count;
1372 next if $known_problems{$filename}{$message} < 0;
1373 if ($problem_count > $known_problems{$filename}{$message}) {
1374
1375 # Here we are about to output all the messages for this type,
1376 # subtract back this number we previously added in.
1377 $total_known -= $problem_count;
1378
1379 $diagnostic .= $indent . $message;
1380 if ($problem_count > 2) {
1381 $diagnostic .= " ($problem_count occurrences)";
1382 }
1383 foreach my $problem (@{$problems{$filename}{$message}}) {
1384 $diagnostic .= " " if $problem_count == 1;
1385 $diagnostic .= "\n$indent$indent";
1386 $diagnostic .= "$problem->{parameter}" if $problem->{parameter};
1387 $diagnostic .= " near line $problem->{-line}";
1388 $diagnostic .= " $problem->{comment}" if $problem->{comment};
1389 }
1390 $diagnostic .= "\n";
1391 $files_with_unknown_issues{$filename} = 1;
1392 } elsif ($problem_count < $known_problems{$filename}{$message}) {
1393 $diagnostic = output_thanks($filename, $known_problems{$filename}{$message}, $problem_count, $message);
1394 }
1395 push @diagnostics, $diagnostic if $diagnostic;
1396 }
1397
09ea063a
KW
1398 # The above loop has output messages where there are current potential
1399 # issues. But it misses where there were some that have been entirely
1400 # fixed. For those, we need to look through the old issues
1401 foreach my $message ( sort keys %{$known_problems{$filename}}) {
1402 next if $problems{$filename}{$message};
1403 next if ! $known_problems{$filename}{$message};
2497e1df 1404 next if $known_problems{$filename}{$message} < 0; # Preserve negs
09ea063a
KW
1405 my $diagnostic = output_thanks($filename, $known_problems{$filename}{$message}, 0, $message);
1406 push @diagnostics, $diagnostic if $diagnostic;
1407 }
1408
16384ac1
KW
1409 my $output = "POD of $filename";
1410 $output .= ", excluding $total_known not shown known potential problems"
1411 if $total_known;
1412 ok(@diagnostics == 0, $output);
1413 if (@diagnostics) {
1414 note(join "", @diagnostics,
1415 "See end of this test output for your options");
1416 }
1417 }
1418}
1419
1420my $how_to = <<EOF;
1421 run this test script by hand, using the following formula (on
1422 Un*x-like machines):
1423 cd t
1424 ./perl -I../lib porting/podcheck.t --regen
1425EOF
1426
1427if (%files_with_unknown_issues) {
1428 my $were_count_files = scalar keys %files_with_unknown_issues;
1429 $were_count_files = ($were_count_files == 1)
1430 ? "was $were_count_files file"
1431 : "were $were_count_files files";
1432 my $message = <<EOF;
1433
1434HOW TO GET THIS .t TO PASS
1435
1436There $were_count_files that had new potential problems identified. To get
1437this .t to pass, do the following:
1438
14391) If a problem is about a link to an unknown module that you know exists,
1440 simply edit the file,
1441 $known_issues
1442 and add anywhere a line that contains just the module's name.
1443 (Don't do this for a module that you aren't sure about; instead treat
1444 as another type of issue and follow the instructions below.)
1445
14462) For other issues, decide if each should be fixed now or not. Fix the
1447 ones you decided to, and rerun this test to verify that the fixes
1448 worked.
1449
14503) If there remain potential problems that you don't plan to fix right
1451 now (or aren't really problems),
1452$how_to
1453 That should cause all current potential problems to be accepted by the
1454 program, so that the next time it runs, they won't be flagged.
1455EOF
1456 if (%files_with_fixes) {
1457 $message .= " This step will also take care of the files that have fixes in them\n";
1458 }
1459
1460 $message .= <<EOF;
1461 For a few files, such as perltoc, certain issues will always be
1462 expected, and more of the same will be added over time. For those,
1463 before you do the regen, you can edit
1464 $known_issues
1465 and find the entry for the module's file and specific error message,
1466 and change the count of known potential problems to -1.
1467EOF
1468
1469 note($message);
1470} elsif (%files_with_fixes) {
1471 note(<<EOF
1472To teach this test script that the potential problems have been fixed,
1473$how_to
1474EOF
1475 );
1476}
1477
1478if ($regen) {
1479 chdir $original_dir || die "Can't change directories to $original_dir";
1480 close_and_rename($copy_fh);
1481}