cpan/Pod-Simple/lib/Pod/Simple.pod

=head1 NAME

Pod::Simple - framework for parsing Pod

=head1 SYNOPSIS

 TODO

=head1 DESCRIPTION

Pod::Simple is a Perl library for parsing text in the Pod ("plain old
documentation") markup language that is typically used for writing
documentation for Perl and for Perl modules. The Pod format is explained
in L<perlpod>; the most common formatter is called C<perldoc>.

Be sure to read L</ENCODING> if your Pod contains non-ASCII characters.

Pod formatters can use Pod::Simple to parse Pod documents and render them into
plain text, HTML, or any number of other formats. Typically, such formatters
will be subclasses of Pod::Simple, and so they will inherit its methods, like
C<parse_file>.

If you're reading this document just because you have a Pod-processing
subclass that you want to use, this document (plus the documentation for the
subclass) is probably all you need to read.

If you're reading this document because you want to write a formatter
subclass, continue reading it and then read L<Pod::Simple::Subclassing>, and
then possibly even read L<perlpodspec> (some of which is for parser-writers,
but much of which is notes to formatter-writers).

=head1 MAIN METHODS

=over

=item C<< $parser = I<SomeClass>->new(); >>

This returns a new parser object, where I<C<SomeClass>> is a subclass
of Pod::Simple.

=item C<< $parser->output_fh( *OUT ); >>

This sets the filehandle that C<$parser>'s output will be written to.
You can pass C<*STDOUT>; otherwise you should probably do something
like this:

    my $outfile = "output.txt";
    open my $txt_out, '>', $outfile or die "Can't write to $outfile: $!";
    $parser->output_fh($txt_out);

...before you call one of the C<< $parser->parse_I<whatever> >> methods.

=item C<< $parser->output_string( \$somestring ); >>

This sets the string that C<$parser>'s output will be sent to,
instead of any filehandle.

=item C<< $parser->parse_file( I<$some_filename> ); >>

=item C<< $parser->parse_file( *INPUT_FH ); >>

This reads the Pod content of the file (or filehandle) that you specify,
and processes it with that C<$parser> object, according to however
C<$parser>'s class works, and according to whatever parser options you
have set up for this C<$parser> object.

=item C<< $parser->parse_string_document( I<$all_content> ); >>

This works just like C<parse_file> except that it reads the Pod
content not from a file, but from a string that you have already
in memory.
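
As a minimal sketch of how the methods above fit together, the following
uses L<Pod::Simple::Text> (one of the formatter subclasses distributed with
Pod::Simple) to render an in-memory document into a string. The sample Pod
and the variable names are assumptions made up for this illustration:

    use strict;
    use warnings;
    use Pod::Simple::Text;

    # Sample document; the content is invented for this example.
    my $pod = join "\n",
        '=head1 NAME',
        '',
        'Demo - a tiny example document',
        '',
        '=cut',
        '';

    my $parser   = Pod::Simple::Text->new;
    my $rendered = '';
    $parser->output_string( \$rendered );    # collect output in $rendered
    $parser->parse_string_document( $pod );

    print $rendered;

Note that the output destination is set before the parse method is called.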

=item C<< $parser->parse_lines( I<...@lines...>, undef ); >>

This processes the lines in C<@lines> (where each list item must be a
defined value, and must contain exactly one line of content -- so no
items like C<"foo\nbar"> are allowed).  The final C<undef> is used to
indicate the end of the document being parsed.

The other C<parse_I<whatever>> methods are meant to be called only once
per C<$parser> object; but C<parse_lines> can be called as many times per
C<$parser> object as you want, as long as the last call (and only
the last call) ends with an C<undef> value.
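
The calling pattern described above might look like the following sketch,
which feeds one document to a L<Pod::Simple::Text> parser in two batches.
The sample lines are invented for illustration, and each item is passed
without a trailing newline:

    use strict;
    use warnings;
    use Pod::Simple::Text;

    my $parser   = Pod::Simple::Text->new;
    my $rendered = '';
    $parser->output_string( \$rendered );

    # First batch: no trailing undef, so the document stays open.
    $parser->parse_lines( '=head1 NAME', '', 'Demo - fed in two batches' );

    # Last batch: the final undef marks the end of the document.
    $parser->parse_lines( '', '=cut', undef );

    print $rendered;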

=item C<< $parser->content_seen >>

This returns true only if any real Pod content has been seen in this
document. It returns false when the document contains content but does
not make use of any Pod markup.
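
One possible use, sketched here with L<Pod::Simple::Text>, is checking
whether a piece of text holds any Pod at all. The helper name C<has_pod>
and the sample strings are hypothetical:

    use strict;
    use warnings;
    use Pod::Simple::Text;

    # Hypothetical helper: report whether a string contains any Pod.
    sub has_pod {
        my ($text) = @_;
        my $parser = Pod::Simple::Text->new;
        $parser->output_string( \my $discarded );   # output is not needed
        $parser->parse_string_document( $text );
        return $parser->content_seen ? 1 : 0;
    }

    print has_pod("=head1 NAME\n\nDemo - has Pod\n\n=cut\n"), "\n";
    print has_pod("print qq{just code, no Pod};\n"), "\n";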

=item C<< I<SomeClass>->filter( I<$filename> ); >>

=item C<< I<SomeClass>->filter( I<*INPUT_FH> ); >>

=item C<< I<SomeClass>->filter( I<\$document_content> ); >>

This is a shortcut method for creating a new parser object, setting the
output handle to STDOUT, and then processing the specified file (or
filehandle, or in-memory document). This is handy for one-liners like
this:

  perl -MPod::Simple::Text -e "Pod::Simple::Text->filter('thingy.pod')"

=back

=head1 SECONDARY METHODS

Some of these methods might be of interest to general users, as
well as of interest to formatter-writers.

Note that the general pattern here is that the accessor-methods
read the attribute's value with C<< $value = $parser->I<attribute> >>
and set the attribute's value with
C<< $parser->I<attribute>(I<newvalue>) >>.  For each accessor, I typically
only mention one syntax or another, based on which I think you are actually
most likely to use.

=over

=item C<< $parser->parse_characters( I<SOMEVALUE> ) >>

The Pod parser normally expects to read octets and to convert those octets
to characters based on the C<=encoding> declaration in the Pod source.  Set
this option to a true value to indicate that the Pod source is already a Perl
character stream.  This tells the parser to ignore any C<=encoding> command
and to skip all the code paths involving decoding octets.
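
A minimal sketch, assuming the Pod is already held in a decoded Perl
character string. The sample text is invented, and L<Pod::Simple::Text> is
used only as a convenient formatter:

    use strict;
    use warnings;
    use Pod::Simple::Text;

    # A character string, not octets: \x{263A} is one wide character.
    my $pod = "=head1 NAME\n\nDemo - smiley \x{263A}\n\n=cut\n";

    my $parser = Pod::Simple::Text->new;
    $parser->parse_characters(1);            # input is already decoded
    my $rendered = '';
    $parser->output_string( \$rendered );
    $parser->parse_string_document( $pod );

    # $rendered holds characters too, so encode it on the way out.
    binmode STDOUT, ':encoding(UTF-8)';
    print $rendered;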

=item C<< $parser->no_whining( I<SOMEVALUE> ) >>

If you set this attribute to a true value, you will suppress the
parser's complaints about irregularities in the Pod coding. By default,
this attribute's value is false, meaning that irregularities will
be reported.

Note that setting this attribute to true won't suppress one or two kinds
of complaints about rarely occurring unrecoverable errors.

=item C<< $parser->no_errata_section( I<SOMEVALUE> ) >>

If you set this attribute to a true value, you will stop the parser from
generating a "POD ERRORS" section at the end of the document. By
default, this attribute's value is false, meaning that an errata section
will be generated, as necessary.

=item C<< $parser->complain_stderr( I<SOMEVALUE> ) >>

If you set this attribute to a true value, the parser will send reports of
parsing errors to STDERR. By default, this attribute's value is false,
meaning that no output is sent to STDERR.

Setting C<complain_stderr> also sets C<no_errata_section>.
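
The effect of these error-reporting attributes can be sketched as follows,
using L<Pod::Simple::Text> and a sample document with a deliberate mistake.
The directive name C<=frobnicate>, the helper C<render>, and its C<quiet>
option are all made up for this illustration:

    use strict;
    use warnings;
    use Pod::Simple::Text;

    # "=frobnicate" is not a Pod command, so the parser should complain.
    my $pod = join "\n",
        '=head1 NAME', '', 'Demo - has an error', '',
        '=frobnicate', '', '=cut', '';

    # Hypothetical helper: render $pod, optionally without the errata section.
    sub render {
        my (%opt) = @_;
        my $parser = Pod::Simple::Text->new;
        $parser->no_errata_section(1) if $opt{quiet};
        my $rendered = '';
        $parser->output_string( \$rendered );
        $parser->parse_string_document( $pod );
        return ( $rendered, $parser->any_errata_seen );
    }

    my ( $with_errata,    $errors_seen ) = render();
    my ( $without_errata )               = render( quiet => 1 );

    print "default output has a POD ERRORS section\n"
        if $with_errata =~ /POD ERRORS/;
    print "quiet output has no POD ERRORS section\n"
        if $without_errata !~ /POD ERRORS/;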

=item C<< $parser->source_filename >>

This returns the filename that this parser object was set to read from.

=item C<< $parser->doc_has_started >>

This returns true if C<$parser> has read from a source, and has seen
Pod content in it.

=item C<< $parser->source_dead >>

This returns true if C<$parser> has read from a source, and come to the
end of that source.

=item C<< $parser->strip_verbatim_indent( I<SOMEVALUE> ) >>

The perlpod spec for a Verbatim paragraph is "It should be reproduced
exactly...", which means that the whitespace you've used to indent your
verbatim blocks will be preserved in the output. This can be annoying for
outputs such as HTML, where that whitespace will remain in front of every
line. It's an unfortunate case where syntax is turned into semantics.

If the POD you're parsing adheres to a consistent indentation policy, you can
have such indentation stripped from the beginning of every line of your
verbatim blocks. This method tells Pod::Simple what to strip. For two-space
indents, you'd use:

  $parser->strip_verbatim_indent('  ');

For tab indents, you'd use a tab character:

  $parser->strip_verbatim_indent("\t");

If the POD is inconsistent about the indentation of verbatim blocks, but you
have figured out a heuristic to determine how much a particular verbatim block
is indented, you can pass a code reference instead. The code reference will be
executed with one argument, an array reference of all the lines in the
verbatim block, and should return the value to be stripped from each line. For
example, if you decide that you're fine to use the first line of the verbatim
block to set the standard for indentation of the rest of the block, you can
look at the first line and return the appropriate value, like so:

  $parser->strip_verbatim_indent(sub {
      my $lines = shift;
      (my $indent = $lines->[0]) =~ s/\S.*//;
      return $indent;
  });

If you'd rather treat each line individually, you can do that, too, by just
transforming them in-place in the code reference and returning C<undef>. Say
that you don't want I<any> lines indented. You can do something like this:

  $parser->strip_verbatim_indent(sub {
      my $lines = shift;
      s/^\s+// for @{ $lines };
      return undef;
  });
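
To see the effect without a formatter re-indenting the result, one option
is to inspect the parse tree built by L<Pod::Simple::SimpleTree>. This is
only a sketch: the sample document is invented, and it assumes the usual
SimpleTree node layout of C<[ $name, \%attributes, @children ]>:

    use strict;
    use warnings;
    use Pod::Simple::SimpleTree;

    # Sample document whose verbatim block is indented by two spaces.
    my $pod = join "\n",
        '=head1 EXAMPLE', '',
        '  my $x = 1;',
        '  print $x;', '',
        '=cut', '';

    my $parser = Pod::Simple::SimpleTree->new;
    $parser->strip_verbatim_indent('  ');
    $parser->parse_string_document( $pod );

    # Find the first Verbatim node among the document's children.
    my @children   = @{ $parser->root };
    my ($verbatim) = grep { ref $_ && $_->[0] eq 'Verbatim' }
                     @children[ 2 .. $#children ];

    # Everything after the name and attribute hash is text content.
    my $code = join '', @{$verbatim}[ 2 .. $#{$verbatim} ];
    print "$code\n";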

=back

=head1 TERTIARY METHODS

=over

=item C<< $parser->abandon_output_fh() >>X<abandon_output_fh>

Cancel output to the file handle. Any POD read by the C<$parser> is not
affected.

=item C<< $parser->abandon_output_string() >>X<abandon_output_string>

Cancel output to the output string. Any POD read by the C<$parser> is not
affected.

=item C<< $parser->accept_code( @codes ) >>X<accept_code>

Alias for C<accept_codes>.

=item C<< $parser->accept_codes( @codes ) >>X<accept_codes>

Allows C<$parser> to accept a list of L<perlpod/Formatting Codes>. This can be
used to implement user-defined codes.

=item C<< $parser->accept_directive_as_data( @directives ) >>X<accept_directive_as_data>

Allows C<$parser> to accept a list of directives for data paragraphs. A
directive is the label of a L<perlpod/Command Paragraph>. A data paragraph is
one delimited by C<< =begin/=for/=end >> directives. This can be used to
implement user-defined directives.

=item C<< $parser->accept_directive_as_processed( @directives ) >>X<accept_directive_as_processed>

Allows C<$parser> to accept a list of directives for processed paragraphs. A
directive is the label of a L<perlpod/Command Paragraph>. A processed
paragraph is also known as an L<perlpod/Ordinary Paragraph>. This can be used
to implement user-defined directives.

=item C<< $parser->accept_directive_as_verbatim( @directives ) >>X<accept_directive_as_verbatim>

Allows C<$parser> to accept a list of directives for L<perlpod/Verbatim
Paragraph>. A directive is the label of a L<perlpod/Command Paragraph>. This
can be used to implement user-defined directives.

=item C<< $parser->accept_target( @targets ) >>X<accept_target>

Alias for C<accept_targets>.

=item C<< $parser->accept_target_as_text( @targets ) >>X<accept_target_as_text>

Alias for C<accept_targets_as_text>.

=item C<< $parser->accept_targets( @targets ) >>X<accept_targets>

Accepts targets for C<< =begin/=for/=end >> sections of the POD.

=item C<< $parser->accept_targets_as_text( @targets ) >>X<accept_targets_as_text>

Accepts targets for C<< =begin/=for/=end >> sections that should be parsed as
POD. For details, see L<< perlpodspec/About Data Paragraphs >>.
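
As a sketch of the difference this makes, the following renders the same
document twice with L<Pod::Simple::Text>, once without and once with the
target accepted. The target name C<note> and the helper C<render> are
hypothetical and exist only for this example:

    use strict;
    use warnings;
    use Pod::Simple::Text;

    # "note" is a made-up target name.
    my $pod = join "\n",
        '=head1 NAME', '', 'Demo - targets', '',
        '=begin note', '', 'Text inside the note region.', '',
        '=end note', '', '=cut', '';

    # Hypothetical helper: render $pod, accepting the given targets as text.
    sub render {
        my (@targets) = @_;
        my $parser = Pod::Simple::Text->new;
        $parser->accept_targets_as_text(@targets) if @targets;
        my $rendered = '';
        $parser->output_string( \$rendered );
        $parser->parse_string_document( $pod );
        return $rendered;
    }

    my $skipped  = render();          # the "note" region is ignored
    my $included = render('note');    # the "note" region is parsed as Pod

    print $included;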

=item C<< $parser->any_errata_seen() >>X<any_errata_seen>

Used to check whether any errata were seen.

I<Example:>

  die "too many errors\n" if $parser->any_errata_seen();

=item C<< $parser->detected_encoding() >>X<detected_encoding>

Returns the encoding corresponding to C<< =encoding >>, but only if the
encoding was recognized and handled.

=item C<< $parser->encoding() >>X<encoding>

Returns the encoding of the document, even if the encoding is not correctly
handled.

=item C<< $parser->parse_from_file( $source, $to ) >>X<parse_from_file>

Parses from C<$source> file to C<$to> file. Similar to
L<< Pod::Parser/parse_from_file >>.

=item C<< $parser->scream( @error_messages ) >>X<scream>

Logs an error that can't be ignored.

=item C<< $parser->unaccept_code( @codes ) >>X<unaccept_code>

Alias for C<unaccept_codes>.

=item C<< $parser->unaccept_codes( @codes ) >>X<unaccept_codes>

Removes C<< @codes >> as valid codes for the parse.

=item C<< $parser->unaccept_directive( @directives ) >>X<unaccept_directive>

Alias for C<unaccept_directives>.

=item C<< $parser->unaccept_directives( @directives ) >>X<unaccept_directives>

Removes C<< @directives >> as valid directives for the parse.

=item C<< $parser->unaccept_target( @targets ) >>X<unaccept_target>

Alias for C<unaccept_targets>.

=item C<< $parser->unaccept_targets( @targets ) >>X<unaccept_targets>

Removes C<< @targets >> as valid targets for the parse.

=item C<< $parser->version_report() >>X<version_report>

Returns a string describing the version.

=item C<< $parser->whine( @error_messages ) >>X<whine>

Logs an error, unless C<no_whining> has been set to a true value.

=back

=head1 ENCODING

The Pod::Simple parser expects to read B<octets>.  The parser will decode the
octets into Perl's internal character string representation using the value of
the C<=encoding> declaration in the POD source.

If the POD source does not include an C<=encoding> declaration, the parser will
attempt to guess the encoding (selecting one of UTF-8 or Latin-1) by examining
the first non-ASCII bytes and applying the heuristic described in
L<perlpodspec>.

If you set the C<parse_characters> option to a true value, the parser will
expect characters rather than octets, will ignore any C<=encoding>
declaration, and will make no attempt to decode the input.
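
The default octet-based behaviour can be sketched like this, using
L<Pod::Simple::Text> and a sample document (invented for this example)
that declares its encoding and contains one two-byte UTF-8 sequence:

    use strict;
    use warnings;
    use Pod::Simple::Text;

    # Octets, as they would be read from a UTF-8 file: \xC3\xA9 is the
    # two-byte UTF-8 encoding of U+00E9 (e with acute accent).
    my $octets = join "\n",
        '=encoding UTF-8', '',
        '=head1 NAME', '',
        "Demo - caf\xC3\xA9", '',
        '=cut', '';

    my $parser   = Pod::Simple::Text->new;
    my $rendered = '';
    $parser->output_string( \$rendered );
    $parser->parse_string_document( $octets );

    # If decoding worked, the two octets are now a single character.
    print "decoded to characters\n" if $rendered =~ /caf\x{E9}/;

The rendered string then holds characters, so it would need to be encoded
again before being written to a file or terminal.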

=head1 CAVEATS

This is just a beta release -- there are a good number of things still
left to do.  Notably, support for EBCDIC platforms is still half-done,
and untested.

=head1 SEE ALSO

L<Pod::Simple::Subclassing>

L<perlpod|perlpod>

L<perlpodspec|perlpodspec>

L<Pod::Escapes|Pod::Escapes>

L<perldoc>

=head1 SUPPORT

Questions or discussion about POD and Pod::Simple should be sent to the
pod-people@perl.org mailing list. Send an empty email to
pod-people-subscribe@perl.org to subscribe.

This module is managed in an open GitHub repository,
L<https://github.com/theory/pod-simple/>. Feel free to fork and contribute, or
to clone L<git://github.com/theory/pod-simple.git> and send patches!

Patches against Pod::Simple are welcome. Please send bug reports to
<bug-pod-simple@rt.cpan.org>.

=head1 COPYRIGHT AND DISCLAIMERS

Copyright (c) 2002 Sean M. Burke.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

This program is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose.

=head1 AUTHOR

Pod::Simple was created by Sean M. Burke <sburke@cpan.org>.
But don't bother him, he's retired.

Pod::Simple is maintained by:

=over

=item * Allison Randal C<allison@perl.org>

=item * Hans Dieter Pearcey C<hdp@cpan.org>

=item * David E. Wheeler C<dwheeler@cpan.org>

=back

Documentation has been contributed by:

=over

=item * Gabor Szabo C<szabgab@gmail.com>

=item * Shawn H Corey C<SHCOREY at cpan.org>

=back

=cut