use strict;
sub _handle_element_start {
- my($parser, $element_name, $attr_hash_r) = @_;
- ...
+ my($parser, $element_name, $attr_hash_r) = @_;
+ ...
}
sub _handle_element_end {
- my($parser, $element_name, $attr_hash_r) = @_;
- # NOTE: $attr_hash_r is only present when $element_name is "over" or "begin"
- # The remaining code excerpts will mostly ignore this $attr_hash_r, as it is
- # mostly useless. It is documented where "over-*" and "begin" events are
- # documented.
- ...
+ my($parser, $element_name, $attr_hash_r) = @_;
+ # NOTE: $attr_hash_r is only present when $element_name is "over" or "begin"
+ # The remaining code excerpts will mostly ignore this $attr_hash_r, as it is
+ # mostly useless. It is documented where "over-*" and "begin" events are
+ # documented.
+ ...
}
sub _handle_text {
- my($parser, $text) = @_;
- ...
+ my($parser, $text) = @_;
+ ...
}
1;
Parsing a document produces this event structure:
<Document start_line="543">
- ...all events...
+ ...all events...
</Document>
The value of the I<start_line> attribute will be the line number of the first
Pod directive in the document.
-If there is no Pod in the given document, then the
+If there is no Pod in the given document, then the
event structure will be this:
<Document contentless="1" start_line="543">
Parsing a plain (non-verbatim, non-directive, non-data) paragraph in
a Pod document produces this event structure:
- <Para start_line="543">
- ...all events in this paragraph...
- </Para>
+ <Para start_line="543">
+ ...all events in this paragraph...
+ </Para>
The value of the I<start_line> attribute will be the line number of the start
of the paragraph.
produces this event structure:
- <Para start_line="129">
- The value of the
- <I>
- start_line
- </I>
- attribute will be the line number of the first Pod directive
- in the document.
- </Para>
+ <Para start_line="129">
+ The value of the
+ <I>
+ start_line
+ </I>
+ attribute will be the line number of the first Pod directive
+ in the document.
+ </Para>
=item events with an element_name of B, C, F, or I.
or S<BE<lt>E<lt>E<lt>E<lt> ... E<gt>E<gt>E<gt>E<gt>>, etc.)
produces this event structure:
- <B>
- ...stuff...
- </B>
+ <B>
+ ...stuff...
+ </B>
Currently, there are no attributes conveyed.
Normally, parsing an SE<lt>...E<gt> sequence produces this event
structure, just as if it were a B/C/F/I code:
- <S>
- ...stuff...
- </S>
+ <S>
+ ...stuff...
+ </S>
However, Pod::Simple (and presumably all derived parsers) offers the
C<nbsp_for_S> option which, if enabled, will suppress all S events, and
Normally, parsing an XE<lt>...E<gt> sequence produces this event
structure, just as if it were a B/C/F/I code:
- <X>
- ...stuff...
- </X>
+ <X>
+ ...stuff...
+ </X>
However, Pod::Simple (and presumably all derived parsers) offers the
C<nix_X_codes> option which, if enabled, will suppress all X events
structure:
<L content-implicit="yes" raw="that_url" to="that_url" type="url">
- that_url
+ that_url
</L>
The C<type="url"> attribute is always specified for this type of
produces this event structure:
<L content-implicit="yes" raw="http://www.perl.com/CPAN/authors/" to="http://www.perl.com/CPAN/authors/" type="url">
- http://www.perl.com/CPAN/authors/
+ http://www.perl.com/CPAN/authors/
</L>
When a LE<lt>I<manpage(section)>E<gt> code is parsed (and these are
fairly rare and not terribly useful), it produces this event structure:
<L content-implicit="yes" raw="manpage(section)" to="manpage(section)" type="man">
- manpage(section)
+ manpage(section)
</L>
The C<type="man"> attribute is always specified for this type of
produces this event structure:
<L content-implicit="yes" raw="crontab(5)" to="crontab(5)" type="man">
- crontab(5)
+ crontab(5)
</L>
In the rare cases where a man page link has a section specified, that text appears
will produce this event structure:
<L content-implicit="yes" raw="crontab(5)/"ENVIRONMENT"" section="ENVIRONMENT" to="crontab(5)" type="man">
- "ENVIRONMENT" in crontab(5)
+ "ENVIRONMENT" in crontab(5)
</L>
In the rare case where the Pod document has code like
will produce this event structure:
<L raw="hell itself!|crontab(5)" to="crontab(5)" type="man">
- hell itself!
+ hell itself!
</L>
The last type of L structure is for links to/within Pod documents. It is
produces this event structure:
<L content-implicit="yes" raw="podpage" to="podpage" type="pod">
- podpage
+ podpage
</L>
For example, this Pod source:
produces this event structure:
<L content-implicit="yes" raw="Net::Ping" to="Net::Ping" type="pod">
- Net::Ping
+ Net::Ping
</L>
In cases where there is link-text explicitly specified, it
produces this event structure:
<L raw="Perl Error Messages|perldiag" to="perldiag" type="pod">
- Perl Error Messages
+ Perl Error Messages
</L>
In cases of links to a section in the current Pod document,
produces this event structure:
<L content-implicit="yes" raw="/"Member Data"" section="Member Data" type="pod">
- "Member Data"
+ "Member Data"
</L>
As another example, this Pod source:
produces this event structure:
<L raw="the various attributes|/"Member Data"" section="Member Data" type="pod">
- the various attributes
+ the various attributes
</L>
In cases of links to a section in a different Pod document,
produces this event structure:
<L content-implicit="yes" raw="perlsyn/"Basic BLOCKs and Switch Statements"" section="Basic BLOCKs and Switch Statements" to="perlsyn" type="pod">
- "Basic BLOCKs and Switch Statements" in perlsyn
+ "Basic BLOCKs and Switch Statements" in perlsyn
</L>
As another example, this Pod source:
produces this event structure:
<L raw="SWITCH statements|perlsyn/"Basic BLOCKs and Switch Statements"" section="Basic BLOCKs and Switch Statements" to="perlsyn" type="pod">
- SWITCH statements
+ SWITCH statements
</L>
Incidentally, note that we do not distinguish between these syntaxes:
That is, they all produce the same event structure (for the most part), namely:
<L content-implicit="yes" raw="$depends_on_syntax" section="Member Data" type="pod">
- "Member Data"
+ "Member Data"
</L>
The I<raw> attribute depends on what the raw content of the C<LE<lt>E<gt>> is,
L<click B<here>|page/About the C<-M> switch>
<L raw="click B<here>|page/About the C<-M> switch" section="About the -M switch" to="page" type="pod">
- click B<here>
+ click B<here>
</L>
Specifically, notice that the formatting codes are present and unescaped
produces this event structure:
<Verbatim start_line="543" xml:space="preserve">
- ...text...
+ ...text...
</Verbatim>
The value of the I<start_line> attribute will be the line number of the
structure:
<head1>
- ...stuff...
+ ...stuff...
</head1>
For example, a directive consisting of this:
will produce this event structure:
<head1 start_line="543">
- Options to
- <C>
- new
- </C>
- et al.
+ Options to
+ <C>
+ new
+ </C>
+ et al.
</head1>
"=head2" through "=head4" directives are the same, except for the element
names in the event structure.
+=item events with an element_name of encoding
+
+In the default case, the events corresponding to C<=encoding> directives
+are not emitted. They are emitted if C<keep_encoding_directive> is true.
+In that case they produce event structures like
+L</"events with an element_name of head1 .. head4"> above.
+
=item events with an element_name of over-bullet
When an "=over ... Z<>=back" block is parsed where the items are
a bulleted list, it will produce this event structure:
<over-bullet indent="4" start_line="543">
- <item-bullet start_line="545">
- ...Stuff...
- </item-bullet>
- ...more item-bullets...
+ <item-bullet start_line="545">
+ ...Stuff...
+ </item-bullet>
+ ...more item-bullets...
</over-bullet fake-closer="1">
The attribute I<fake-closer> is only present if it is a true value; it is not
produces this event structure:
<over-bullet indent="4" start_line="10">
- <item-bullet start_line="12">
- Stuff
- </item-bullet>
- <item-bullet start_line="14">
- Bar <I>baz</I>!
- </item-bullet>
+ <item-bullet start_line="12">
+ Stuff
+ </item-bullet>
+ <item-bullet start_line="14">
+ Bar <I>baz</I>!
+ </item-bullet>
</over-bullet>
=item events with an element_name of over-number
a numbered list, it will produce this event structure:
<over-number indent="4" start_line="543">
- <item-number number="1" start_line="545">
- ...Stuff...
- </item-number>
- ...more item-number...
+ <item-number number="1" start_line="545">
+ ...Stuff...
+ </item-number>
+ ...more item-number...
</over-number>
This is like the "over-bullet" event structure; but note that the contents
a list of text "subheadings", it will produce this event structure:
<over-text indent="4" start_line="543">
- <item-text>
- ...stuff...
- </item-text>
- ...stuff (generally Para or Verbatim elements)...
- <item-text>
- ...more item-text and/or stuff...
+ <item-text>
+ ...stuff...
+ </item-text>
+ ...stuff (generally Para or Verbatim elements)...
+ <item-text>
+ ...more item-text and/or stuff...
</over-text>
The I<indent> and I<fake-closer> attributes are as with the other over-* events.
produces this event structure:
<over-text indent="4" start_line="20">
- <item-text start_line="22">
- Foo
- </item-text>
- <Para start_line="24">
- Stuff
- </Para>
- <item-text start_line="26">
- Bar
- <I>
- baz
- </I>
- !
- </item-text>
- <Para start_line="28">
- Quux
- </Para>
+ <item-text start_line="22">
+ Foo
+ </item-text>
+ <Para start_line="24">
+ Stuff
+ </Para>
+ <item-text start_line="26">
+ Bar
+ <I>
+ baz
+ </I>
+ !
+ </item-text>
+ <Para start_line="28">
+ Quux
+ </Para>
</over-text>
it will produce this event structure:
<over-block indent="4" start_line="543">
- ...stuff (generally Para or Verbatim elements)...
+ ...stuff (generally Para or Verbatim elements)...
</over-block>
The I<indent> and I<fake-closer> attributes are as with the other over-* events.
will produce this event structure:
<over-block indent="4" start_line="2">
- <Para start_line="4">
- For cutting off our trade with all parts of the world
- </Para>
- <Para start_line="6">
- For transporting us beyond seas to be tried for pretended offenses
- </Para>
- <Para start_line="8">
- He is at this time transporting large armies of [...more text...]
- </Para>
+ <Para start_line="4">
+ For cutting off our trade with all parts of the world
+ </Para>
+ <Para start_line="6">
+ For transporting us beyond seas to be tried for pretended offenses
+ </Para>
+ <Para start_line="8">
+ He is at this time transporting large armies of [...more text...]
+ </Para>
</over-block>
=item events with an element_name of over-empty
will produce this event structure:
<over-block indent="4" start_line="1">
- <over-empty indent="4" start_line="3">
- </over-empty>
+ <over-empty indent="4" start_line="3">
+ </over-empty>
</over-block>
Note that the outer C<=over> is a block because it has no C<=item>s but still
-has content: the inner C<=over>. The inner C<=over>, in turn, is completely
+has content: the inner C<=over>. The inner C<=over>, in turn, is completely
empty, and is treated as such.
=item events with an element_name of item-bullet
As the parser sees sections like:
- =for html <img src="fig1.jpg">
+ =for html <img src="fig1.jpg">
or
- =begin html
+ =begin html
- <img src="fig1.jpg">
+ <img src="fig1.jpg">
- =end html
+ =end html
...the parser will ignore these sections unless your subclass has
specified that it wants to see sections targeted to "html" (or whatever
you don't actually see in the parse tree, Z and E). For example, to also
accept codes "N", "R", and "W":
- $parser->accept_codes( qw( N R W ) );
+ $parser->accept_codes( qw( N R W ) );
B<TODO: document how this interacts with =extend, and long element names>
For example, to accept a new directive "=method", you'd presumably
use:
- $parser->accept_directive_as_processed("method");
+ $parser->accept_directive_as_processed("method");
so that you could have Pod lines like:
- =method I<$whatever> thing B<um>
+ =method I<$whatever> thing B<um>
Making up your own directives breaks compatibility with other Pod
formatters, in a way that using "=for I<target> ..." lines doesn't;
turn "SE<lt>...E<gt>" sequences into sequences of words separated by
C<\xA0> (non-breaking space) characters. For example, it will take this:
- I like S<Dutch apple pie>, don't you?
+ I like S<Dutch apple pie>, don't you?
and treat it as if it were:
- I like DutchE<nbsp>appleE<nbsp>pie, don't you?
+ I like DutchE<nbsp>appleE<nbsp>pie, don't you?
This is handy for output formats that don't have anything quite like an
"SE<lt>...E<gt>" code, but which do have a code for non-breaking space.
this detail in a comment in the output format. For example, for
some kind of SGML output format:
- print OUT "<!-- \n", $parser->version_report, "\n -->";
+ print OUT "<!-- \n", $parser->version_report, "\n -->";
=item C<< $parser->pod_para_count() >>
Many formats don't actually use the content of these codes, so have
no reason to process them.
+=item C<< $parser->keep_encoding_directive( I<SOMEVALUE> ) >>
+
+This attribute, when set to a true value (it is false by default),
+will keep C<=encoding> and its content in the event structure. Most
+formats don't actually need to process the content of an C<=encoding>
+directive, even when this directive sets the encoding and the
+processor makes use of the encoding information. Indeed, it is
+possible to know the encoding without processing the directive
+content.
=item C<< $parser->merge_text( I<SOMEVALUE> ) >>
for any single contiguous sequence of text. For example, consider
this somewhat contrived example:
- I just LOVE Z<>hotE<32>apple pie!
+ I just LOVE Z<>hotE<32>apple pie!
When that is parsed and events are about to be called on it, it may
actually seem to be four different text events, one right after another:
that no code should be called. If you provide a routine, it should
start out like this:
- sub get_code_line { # or whatever you'll call it
- my($line, $line_number, $parser) = @_;
- ...
- }
+ sub get_code_line { # or whatever you'll call it
+ my($line, $line_number, $parser) = @_;
+ ...
+ }
Note, however, that sometimes the Pod events aren't processed in exactly
the same order as the code lines are -- i.e., if you have a file with
=cut
-
-