=head1 NAME

Pod::Simple - framework for parsing Pod

=head1 DESCRIPTION

Pod::Simple is a Perl library for parsing text in the Pod ("plain old
documentation") markup language that is typically used for writing
documentation for Perl and for Perl modules. The Pod format is explained in
L<perlpod>; the most common formatter is called C<perldoc>.

Be sure to read L</ENCODING> if your Pod contains non-ASCII characters.

Pod formatters can use Pod::Simple to parse Pod documents and render them into
plain text, HTML, or any number of other formats. Typically, such formatters
will be subclasses of Pod::Simple, and so they will inherit its methods, like
C<parse_file>.

If you're reading this document just because you have a Pod-processing
subclass that you want to use, this document (plus the documentation for the
subclass) is probably all you need to read.

If you're reading this document because you want to write a formatter
subclass, continue reading it and then read L<Pod::Simple::Subclassing>, and
then possibly even read L<perlpodspec> (some of which is for parser-writers,
but much of which is notes to formatter-writers).

=head1 MAIN METHODS

=over

=item C<< $parser = I<SomeClass>->new(); >>

This returns a new parser object, where I<C<SomeClass>> is a subclass
of Pod::Simple.

=item C<< $parser->output_fh( *OUT ); >>

This sets the filehandle that C<$parser>'s output will be written to.
You can pass C<*STDOUT>; otherwise you should probably do something
like this:

  my $outfile = "output.txt";
  open my $txtout, '>', $outfile or die "Can't write to $outfile: $!";
  $parser->output_fh($txtout);

...before you call one of the C<< $parser->parse_I<whatever> >> methods.

=item C<< $parser->output_string( \$somestring ); >>

This sets the string that C<$parser>'s output will be sent to,
instead of any filehandle.

=item C<< $parser->parse_file( I<$some_filename> ); >>

=item C<< $parser->parse_file( *INPUT_FH ); >>

This reads the Pod content of the file (or filehandle) that you specify,
and processes it with that C<$parser> object, according to however
C<$parser>'s class works, and according to whatever parser options you
have set up for this C<$parser> object.

=item C<< $parser->parse_string_document( I<$all_content> ); >>

This works just like C<parse_file> except that it reads the Pod
content not from a file, but from a string that you have already
in memory.

=item C<< $parser->parse_lines( I<...@lines...>, undef ); >>

This processes the lines in C<@lines> (where each list item must be a
defined value, and must contain exactly one line of content -- so no
items like C<"foo\nbar"> are allowed). The final C<undef> is used to
indicate the end of the document being parsed.

The other C<parse_I<whatever>> methods are meant to be called only once
per C<$parser> object; but C<parse_lines> can be called as many times per
C<$parser> object as you want, as long as the last call (and only
the last call) ends with an C<undef> value.

=item C<< $parser->content_seen >>

This returns true only if there has been any real content seen for this
document. Returns false in cases where the document contains content,
but does not make use of any Pod markup.

=item C<< I<SomeClass>->filter( I<$filename> ); >>

=item C<< I<SomeClass>->filter( I<*INPUT_FH> ); >>

=item C<< I<SomeClass>->filter( I<\$document_content> ); >>

This is a shortcut method for creating a new parser object, setting the
output handle to STDOUT, and then processing the specified file (or
filehandle, or in-memory document). This is handy for one-liners like
this:

  perl -MPod::Simple::Text -e "Pod::Simple::Text->filter('thingy.pod')"

=back

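As a rough sketch of how the main methods fit together, the following renders
a small document held in a string, feeds the same content through
C<parse_lines>, and checks C<content_seen>. It assumes the
L<Pod::Simple::Text> formatter as the subclass; the variable names and the
sample text are made up for illustration.

  use strict;
  use warnings;
  use Pod::Simple::Text;

  # Render a whole document held in a string.
  my $pod = "=head1 NAME\n\nDemo - a tiny example\n";
  my $parser = Pod::Simple::Text->new;
  my $text = '';
  $parser->output_string(\$text);     # set the output before parsing
  $parser->parse_string_document($pod);

  # Feed the same content line by line; only the last call ends with undef.
  my $by_line = Pod::Simple::Text->new;
  my $text2 = '';
  $by_line->output_string(\$text2);
  $by_line->parse_lines('=head1 NAME', '');
  $by_line->parse_lines('Demo - a tiny example', undef);

  # Input without any Pod markup leaves content_seen false.
  my $empty = Pod::Simple::Text->new;
  my $ignored = '';
  $empty->output_string(\$ignored);
  $empty->parse_string_document("print 'hello';\n");
  print "no Pod found\n" unless $empty->content_seen;

Each parser object is used for a single document here, in line with the note
that only C<parse_lines> is meant to be called more than once per object.
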
=head1 SECONDARY METHODS

Some of these methods might be of interest to general users, as
well as of interest to formatter-writers.

Note that the general pattern here is that the accessor-methods
read the attribute's value with C<< $value = $parser->I<attribute> >>
and set the attribute's value with
C<< $parser->I<attribute>(I<newvalue>) >>. For each accessor, I typically
only mention one syntax or another, based on which I think you are actually
likely to use.

=over

=item C<< $parser->parse_characters( I<SOMEVALUE> ) >>

The Pod parser normally expects to read octets and to convert those octets
to characters based on the C<=encoding> declaration in the Pod source. Set
this option to a true value to indicate that the Pod source is already a Perl
character stream. This tells the parser to ignore any C<=encoding> command
and to skip all the code paths involving decoding octets.

=item C<< $parser->no_whining( I<SOMEVALUE> ) >>

If you set this attribute to a true value, you will suppress the
parser's complaints about irregularities in the Pod coding. By default,
this attribute's value is false, meaning that irregularities will
be reported.

Note that setting this attribute to true won't suppress one or two kinds
of complaints about rarely occurring unrecoverable errors.

=item C<< $parser->no_errata_section( I<SOMEVALUE> ) >>

If you set this attribute to a true value, you will stop the parser from
generating a "POD ERRORS" section at the end of the document. By
default, this attribute's value is false, meaning that an errata section
will be generated, as necessary.

=item C<< $parser->complain_stderr( I<SOMEVALUE> ) >>

If you set this attribute to a true value, the parser will send reports of
parsing errors to STDERR. By default, this attribute's value is false,
meaning that no output is sent to STDERR.

Setting C<complain_stderr> also sets C<no_errata_section>.

=item C<< $parser->source_filename >>

This returns the filename that this parser object was set to read from.

=item C<< $parser->doc_has_started >>

This returns true if C<$parser> has read from a source, and has seen

=item C<< $parser->source_dead >>

This returns true if C<$parser> has read from a source, and come to the
end of that source.

=item C<< $parser->strip_verbatim_indent( I<SOMEVALUE> ) >>

The perlpod spec for a Verbatim paragraph is "It should be reproduced
exactly...", which means that the whitespace you've used to indent your
verbatim blocks will be preserved in the output. This can be annoying for
outputs such as HTML, where that whitespace will remain in front of every
line. It's an unfortunate case where syntax is turned into semantics.

If the POD you're parsing adheres to a consistent indentation policy, you can
have such indentation stripped from the beginning of every line of your
verbatim blocks. This method tells Pod::Simple what to strip. For two-space
indents, you'd use:

  $parser->strip_verbatim_indent('  ');

For tab indents, you'd use a tab character:

  $parser->strip_verbatim_indent("\t");

If the POD is inconsistent about the indentation of verbatim blocks, but you
have figured out a heuristic to determine how much a particular verbatim block
is indented, you can pass a code reference instead. The code reference will be
executed with one argument, an array reference of all the lines in the
verbatim block, and should return the value to be stripped from each line. For
example, if you decide that you're fine to use the first line of the verbatim
block to set the standard for indentation of the rest of the block, you can
look at the first line and return the appropriate value, like so:

  $parser->strip_verbatim_indent(sub {
      my $lines = shift;
      (my $indent = $lines->[0]) =~ s/\S.*//;
      return $indent;
  });

If you'd rather treat each line individually, you can do that, too, by just
transforming them in-place in the code reference and returning C<undef>. Say
that you don't want I<any> lines indented. You can do something like this:

  $parser->strip_verbatim_indent(sub {
      my $lines = shift;
      s/^\s+// for @{ $lines };
      return undef;
  });

=back

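The attributes above are set on the parser object before parsing begins. A
minimal sketch, assuming L<Pod::Simple::Text> as the formatter; C<render> is
a hypothetical helper written for this example and is not part of
Pod::Simple, and the sample documents are made up.

  use strict;
  use warnings;
  use Pod::Simple::Text;

  # Hypothetical helper: render $pod as text, letting the caller set
  # attributes on the parser before anything is parsed.
  sub render {
      my ($pod, $setup) = @_;
      my $parser = Pod::Simple::Text->new;
      my $out = '';
      $parser->output_string(\$out);
      $setup->($parser) if $setup;
      $parser->parse_string_document($pod);
      return $out;
  }

  # An unknown directive is an irregularity that the parser reports.
  my $sloppy = "=head1 NAME\n\nDemo\n\n=bogus directive\n";
  my $with_errata    = render($sloppy);
  my $without_errata = render($sloppy, sub { $_[0]->no_errata_section(1) });

  # A verbatim block that is indented by two spaces in the source.
  my $code = "=head1 EXAMPLE\n\n  first line\n    nested line\n";
  my $plain    = render($code);
  my $stripped = render($code, sub { $_[0]->strip_verbatim_indent('  ') });

With the default settings, C<$with_errata> should end with a "POD ERRORS"
section, while C<$without_errata> should not. The formatter may add its own
indentation to verbatim blocks, so compare C<$plain> and C<$stripped> with
each other rather than with the source text.
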
=head1 TERTIARY METHODS

=over

=item C<< $parser->abandon_output_fh() >>X<abandon_output_fh>

Cancel output to the file handle. Any POD read by the C<$parser> is not

=item C<< $parser->abandon_output_string() >>X<abandon_output_string>

Cancel output to the output string. Any POD read by the C<$parser> is not

=item C<< $parser->accept_code( @codes ) >>X<accept_code>

Alias for L<< accept_codes >>.

=item C<< $parser->accept_codes( @codes ) >>X<accept_codes>

Allows C<$parser> to accept a list of L<perlpod/Formatting Codes>. This can be
used to implement user-defined codes.

=item C<< $parser->accept_directive_as_data( @directives ) >>X<accept_directive_as_data>

Allows C<$parser> to accept a list of directives for data paragraphs. A
directive is the label of a L<perlpod/Command Paragraph>. A data paragraph is
one delimited by C<< =begin/=for/=end >> directives. This can be used to
implement user-defined directives.

=item C<< $parser->accept_directive_as_processed( @directives ) >>X<accept_directive_as_processed>

Allows C<$parser> to accept a list of directives for processed paragraphs. A
directive is the label of a L<perlpod/Command Paragraph>. A processed
paragraph is also known as L<perlpod/Ordinary Paragraph>. This can be used to
implement user-defined directives.

=item C<< $parser->accept_directive_as_verbatim( @directives ) >>X<accept_directive_as_verbatim>

Allows C<$parser> to accept a list of directives for
L<perlpod/Verbatim Paragraph>. A directive is the label of a
L<perlpod/Command Paragraph>. This can be used to implement user-defined
directives.

=item C<< $parser->accept_target( @targets ) >>X<accept_target>

Alias for L<< accept_targets >>.

=item C<< $parser->accept_target_as_text( @targets ) >>X<accept_target_as_text>

Alias for L<< accept_targets_as_text >>.

=item C<< $parser->accept_targets( @targets ) >>X<accept_targets>

Accepts targets for C<< =begin/=for/=end >> sections of the POD.

=item C<< $parser->accept_targets_as_text( @targets ) >>X<accept_targets_as_text>

Accepts targets for C<< =begin/=for/=end >> sections that should be parsed as
POD. For details, see L<< perlpodspec/About Data Paragraphs >>.

=item C<< $parser->any_errata_seen() >>X<any_errata_seen>

Used to check whether any errata were seen.

Example:

  die "too many errors\n" if $parser->any_errata_seen();

=item C<< $parser->detected_encoding() >>X<detected_encoding>

Return the encoding corresponding to C<< =encoding >>, but only if the
encoding was recognized and handled.

=item C<< $parser->encoding() >>X<encoding>

Return encoding of the document, even if the encoding is not correctly
handled.

=item C<< $parser->parse_from_file( $source, $to ) >>X<parse_from_file>

Parses from C<$source> file to C<$to> file. Similar to L<<
Pod::Parser/parse_from_file >>.

=item C<< $parser->scream( @error_messages ) >>X<scream>

Log an error that can't be ignored.

=item C<< $parser->unaccept_code( @codes ) >>X<unaccept_code>

Alias for L<< unaccept_codes >>.

=item C<< $parser->unaccept_codes( @codes ) >>X<unaccept_codes>

Removes C<< @codes >> as valid codes for the parse.

=item C<< $parser->unaccept_directive( @directives ) >>X<unaccept_directive>

Alias for L<< unaccept_directives >>.

=item C<< $parser->unaccept_directives( @directives ) >>X<unaccept_directives>

Removes C<< @directives >> as valid directives for the parse.

=item C<< $parser->unaccept_target( @targets ) >>X<unaccept_target>

Alias for L<< unaccept_targets >>.

=item C<< $parser->unaccept_targets( @targets ) >>X<unaccept_targets>

Removes C<< @targets >> as valid targets for the parse.

=item C<< $parser->version_report() >>X<version_report>

Returns a string describing the version.

=item C<< $parser->whine( @error_messages ) >>X<whine>

Log an error unless C<< $parser->no_whining( TRUE ); >>.

=back

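To illustrate C<accept_targets_as_text> and C<any_errata_seen>, here is a
small sketch that assumes L<Pod::Simple::Text> as the formatter. The target
name C<notes> is invented for this example; it has no special meaning to
Pod::Simple.

  use strict;
  use warnings;
  use Pod::Simple::Text;

  # "notes" is a made-up target name used only for this illustration.
  my $pod = join "\n",
      '=head1 NAME', '', 'Demo', '',
      '=begin notes', '', 'Only shown when the target is accepted.', '',
      '=end notes', '';

  # By default the formatter does not know the target and skips the region.
  my $default = Pod::Simple::Text->new;
  my $skipped = '';
  $default->output_string(\$skipped);
  $default->parse_string_document($pod);

  # Accept the target so that the region is parsed as ordinary Pod.
  my $parser = Pod::Simple::Text->new;
  $parser->accept_targets_as_text('notes');
  my $shown = '';
  $parser->output_string(\$shown);
  $parser->parse_string_document($pod);

  warn "Pod irregularities found\n" if $parser->any_errata_seen;

Only C<$shown> should contain the paragraph from the C<=begin notes> region,
and no warning is expected because the sample document is well-formed.
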
=head1 ENCODING

The Pod::Simple parser expects to read B<octets>. The parser will decode the
octets into Perl's internal character string representation using the value of
the C<=encoding> declaration in the POD source.

If the POD source does not include an C<=encoding> declaration, the parser will
attempt to guess the encoding (selecting one of UTF-8 or Latin-1) by examining
the first non-ASCII bytes and applying the heuristic described in

If you set the C<parse_characters> option to a true value the parser will
expect characters rather than octets; will ignore any C<=encoding>; and will
make no attempt to decode the input.

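A minimal sketch of the C<parse_characters> case, assuming
L<Pod::Simple::Text> as the formatter and a source string that has already
been decoded into Perl characters (the sample text is made up):

  use strict;
  use warnings;
  use Pod::Simple::Text;

  # This string holds the character U+00E9, not encoded octets.
  my $pod = "=head1 NAME\n\nCaf\x{e9} - already decoded\n";

  my $parser = Pod::Simple::Text->new;
  $parser->parse_characters(1);   # no =encoding handling, no guessing
  my $text = '';
  $parser->output_string(\$text);
  $parser->parse_string_document($pod);

The result in C<$text> is a character string as well, so encode it yourself
before writing it to a file or terminal.
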
=head1 CAVEATS

This is just a beta release -- there are a good number of things still
left to do. Notably, support for EBCDIC platforms is still half-done,

=head1 SEE ALSO

L<Pod::Simple::Subclassing>

L<perlpodspec|perlpodspec>

L<Pod::Escapes|Pod::Escapes>

=head1 SUPPORT

Questions or discussion about POD and Pod::Simple should be sent to the
pod-people@perl.org mail list. Send an empty email to
pod-people-subscribe@perl.org to subscribe.

This module is managed in an open GitHub repository,
L<https://github.com/theory/pod-simple/>. Feel free to fork and contribute, or
to clone L<git://github.com/theory/pod-simple.git> and send patches!

Patches against Pod::Simple are welcome. Please send bug reports to
<bug-pod-simple@rt.cpan.org>.

=head1 COPYRIGHT AND DISCLAIMERS

Copyright (c) 2002 Sean M. Burke.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

This program is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose.

=head1 AUTHOR

Pod::Simple was created by Sean M. Burke <sburke@cpan.org>.
But don't bother him, he's retired.

Pod::Simple is maintained by:

=over

=item * Allison Randal C<allison@perl.org>

=item * Hans Dieter Pearcey C<hdp@cpan.org>

=item * David E. Wheeler C<dwheeler@cpan.org>

=back

Documentation has been contributed by:

=over

=item * Gabor Szabo C<szabgab@gmail.com>

=item * Shawn H Corey C<SHCOREY at cpan.org>

=back