+=encoding utf8
=head1 NAME
L</About Data Paragraphs and "=beginE<sol>=end" Regions>.
It is advised that formatnames match the regexp
-C<m/\A:?[−a−zA−Z0−9_]+\z/>. Everything following whitespace after the
+C<m/\A:?[-a-zA-Z0-9_]+\z/>. Everything following whitespace after the
formatname is a parameter that may be used by the formatter when dealing
with this region. This parameter must not be repeated in the "=end"
paragraph. Implementors should anticipate future expansion in the
and so on.
+Finally, the multiple-angle-bracket form does I<not> alter the interpretation
+of nested formatting codes, meaning that the following four example lines are
+identical in meaning:
+
+ B<example: C<$a E<lt>=E<gt> $b>>
+
+ B<example: C<< $a <=> $b >>>
+
+ B<example: C<< $a E<lt>=E<gt> $b >>>
+
+ B<<< example: C<< $a E<lt>=E<gt> $b >> >>>
+
=back
In parsing Pod, a notably tricky part is the correct parsing of
=item *
-A naive but sufficient heuristic for testing the first highbit
+A naive but often sufficient heuristic for testing the first highbit
byte-sequence in a BOM-less file (whether in code or in Pod!), to see
whether that sequence is valid as UTF-8 (RFC 2279) is to check whether
-that the first byte in the sequence is in the range 0xC0 - 0xFD
+the first byte in the sequence is in the range 0xC2 - 0xFD
I<and> whether the next byte is in the range
0x80 - 0xBF. If so, the parser may conclude that this file is in
UTF-8, and all highbit sequences in the file should be assumed to
be UTF-8. Otherwise the parser should treat the file as being
-in Latin-1. In the unlikely circumstance that the first highbit
+in Latin-1. (A better check is to pass a copy of the sequence to
+L<utf8::decode()|utf8>, which performs a full validity check on the
+sequence and returns TRUE if it is valid UTF-8, FALSE otherwise. This
+function is always pre-loaded, is fast because it is written in C, and
+is called at most once, so you don't need to avoid it out of
+performance concerns.)
+In the unlikely circumstance that the first highbit
sequence in a truly non-UTF-8 file happens to appear to be UTF-8, one
can cater to our heuristic (as well as any more intelligent heuristic)
by prefacing that line with a comment line containing a highbit
version numbers of any modules it might be using to process the Pod.
Minimal examples:
- %% POD::Pod2PS v3.14159, using POD::Parser v1.92
+ %% POD::Pod2PS v3.14159, using POD::Parser v1.92
- <!-- Pod::HTML v3.14159, using POD::Parser v1.92 -->
+ <!-- Pod::HTML v3.14159, using POD::Parser v1.92 -->
- {\doccomm generated by Pod::Tree::RTF 3.14159 using Pod::Tree 1.08}
+ {\doccomm generated by Pod::Tree::RTF 3.14159 using Pod::Tree 1.08}
- .\" Pod::Man version 3.14159, using POD::Parser version 1.92
+ .\" Pod::Man version 3.14159, using POD::Parser version 1.92
Formatters may also insert additional comments, including: the
release date of the Pod formatter program, the contact address for
For example:
L<Foo::Bar>
- => undef, # link text
- "Foo::Bar", # possibly inferred link text
- "Foo::Bar", # name
- undef, # section
- 'pod', # what sort of link
- "Foo::Bar" # original content
+ => undef, # link text
+ "Foo::Bar", # possibly inferred link text
+ "Foo::Bar", # name
+ undef, # section
+ 'pod', # what sort of link
+ "Foo::Bar" # original content
L<Perlport's section on NL's|perlport/Newlines>
- => "Perlport's section on NL's", # link text
- "Perlport's section on NL's", # possibly inferred link text
- "perlport", # name
- "Newlines", # section
- 'pod', # what sort of link
- "Perlport's section on NL's|perlport/Newlines" # orig. content
+ => "Perlport's section on NL's", # link text
+ "Perlport's section on NL's", # possibly inferred link text
+ "perlport", # name
+ "Newlines", # section
+ 'pod', # what sort of link
+ "Perlport's section on NL's|perlport/Newlines"
+ # original content
L<perlport/Newlines>
- => undef, # link text
- '"Newlines" in perlport', # possibly inferred link text
- "perlport", # name
- "Newlines", # section
- 'pod', # what sort of link
- "perlport/Newlines" # original content
+ => undef, # link text
+ '"Newlines" in perlport', # possibly inferred link text
+ "perlport", # name
+ "Newlines", # section
+ 'pod', # what sort of link
+ "perlport/Newlines" # original content
L<crontab(5)/"DESCRIPTION">
- => undef, # link text
- '"DESCRIPTION" in crontab(5)', # possibly inferred link text
- "crontab(5)", # name
- "DESCRIPTION", # section
- 'man', # what sort of link
- 'crontab(5)/"DESCRIPTION"' # original content
+ => undef, # link text
+ '"DESCRIPTION" in crontab(5)', # possibly inferred link text
+ "crontab(5)", # name
+ "DESCRIPTION", # section
+ 'man', # what sort of link
+ 'crontab(5)/"DESCRIPTION"' # original content
L</Object Attributes>
- => undef, # link text
- '"Object Attributes"', # possibly inferred link text
- undef, # name
- "Object Attributes", # section
- 'pod', # what sort of link
- "/Object Attributes" # original content
+ => undef, # link text
+ '"Object Attributes"', # possibly inferred link text
+ undef, # name
+ "Object Attributes", # section
+ 'pod', # what sort of link
+ "/Object Attributes" # original content
L<http://www.perl.org/>
- => undef, # link text
- "http://www.perl.org/", # possibly inferred link text
- "http://www.perl.org/", # name
- undef, # section
- 'url', # what sort of link
- "http://www.perl.org/" # original content
+ => undef, # link text
+ "http://www.perl.org/", # possibly inferred link text
+ "http://www.perl.org/", # name
+ undef, # section
+ 'url', # what sort of link
+ "http://www.perl.org/" # original content
L<Perl.org|http://www.perl.org/>
- => "Perl.org", # link text
- "http://www.perl.org/", # possibly inferred link text
- "http://www.perl.org/", # name
- undef, # section
- 'url', # what sort of link
+ => "Perl.org", # link text
+ "http://www.perl.org/", # possibly inferred link text
+ "http://www.perl.org/", # name
+ undef, # section
+ 'url', # what sort of link
"Perl.org|http://www.perl.org/" # original content
Note that you can distinguish URL-links from anything else by the
=item *
-Authors wanting to link to a particular (absolute) URL, must do so
-only with "LE<lt>scheme:...>" codes (like
-LE<lt>http://www.perl.org>), and must not attempt "LE<lt>Some Site
-Name|scheme:...>" codes. This restriction avoids many problems
-in parsing and rendering LE<lt>...> codes.
-
-=item *
-
In a C<LE<lt>text|...E<gt>> code, text may contain formatting codes
for formatting or for EE<lt>...> escapes, as in: