utf16_to_utf8_reversed() should croak early when passed an odd byte length.

diff --git a/pod/perlfaq9.pod b/pod/perlfaq9.pod

index 7fc0cdc..ce0cf07 100644
--- a/pod/perlfaq9.pod
+++ b/pod/perlfaq9.pod
@@ -1,45 +1,69 @@
  =head1 NAME
  
-perlfaq9 - Networking ($Revision: 1.26 $, $Date: 1999/05/23 16:08:30 $)
+perlfaq9 - Networking
  
  =head1 DESCRIPTION
  
  This section deals with questions related to networking, the internet,
  and a few on the web.
  
-=head2 My CGI script runs from the command line but not the browser.   (500 Server Error)
+=head2 What is the correct form of response from a CGI script?
  
-If you can demonstrate that you've read the following FAQs and that
-your problem isn't something simple that can be easily answered, you'll
-probably receive a courteous and useful reply to your question if you
-post it on comp.infosystems.www.authoring.cgi (if it's something to do
-with HTTP, HTML, or the CGI protocols).  Questions that appear to be Perl
-questions but are really CGI ones that are posted to comp.lang.perl.misc
-may not be so well received.
+(Alan Flavell <flavell+www@a5.ph.gla.ac.uk> answers...)
+
+The Common Gateway Interface (CGI) specifies a software interface between
+a program ("CGI script") and a web server (HTTPD). It is not specific
+to Perl, and has its own FAQs and tutorials, and usenet group,
+comp.infosystems.www.authoring.cgi
  
-The useful FAQs and related documents are:
+The CGI specification is outlined in an informational RFC:
+http://www.ietf.org/rfc/rfc3875
  
-    CGI FAQ
-        http://www.webthing.com/page.cgi/cgifaq
+Other relevant documentation is listed in: http://www.perl.org/CGI_MetaFAQ.html
  
-    Web FAQ
-        http://www.boutell.com/faq/
+These Perl FAQs very selectively cover some CGI issues. However, Perl
+programmers are strongly advised to use the CGI.pm module, to take care
+of the details for them.
  
-    WWW Security FAQ
-        http://www.w3.org/Security/Faq/
+The similarity between CGI response headers (defined in the CGI
+specification) and HTTP response headers (defined in the HTTP
+specification, RFC2616) is intentional, but can sometimes be confusing.
  
-    HTTP Spec
-        http://www.w3.org/pub/WWW/Protocols/HTTP/
+The CGI specification defines two kinds of script: the "Parsed Header"
+script, and the "Non Parsed Header" (NPH) script. Check your server
+documentation to see what it supports. "Parsed Header" scripts are
+simpler in various respects. The CGI specification allows any of the
+usual newline representations in the CGI response (it's the server's
+job to create an accurate HTTP response based on it). So "\n" written in
+text mode is technically correct, and recommended. NPH scripts are more
+tricky: they must put out a complete and accurate set of HTTP
+transaction response headers; the HTTP specification calls for records
+to be terminated with carriage-return and line-feed, i.e. ASCII \015\012
+written in binary mode.
  
-    HTML Spec
-        http://www.w3.org/TR/REC-html40/
-        http://www.w3.org/pub/WWW/MarkUp/
+Using CGI.pm gives excellent platform independence, including EBCDIC
+systems. CGI.pm selects an appropriate newline representation
+($CGI::CRLF) and sets binmode as appropriate.
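As a rough illustration of the "Parsed Header" case described above, here is a minimal sketch that builds a response by hand: header lines, one blank line, then the body, all with plain "\n". The helper name cgi_response is hypothetical and only serves this example; in real scripts CGI.pm's header() is the better choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: assemble a "Parsed Header" CGI response.
# Plain "\n" is used throughout; the web server is expected to turn
# this into a proper HTTP response with the right line endings.
sub cgi_response {
    my ($type, $body) = @_;
    return "Content-Type: $type\n"   # CGI response header
         . "\n"                      # blank line ends the headers
         . $body;
}

print cgi_response('text/plain', "Hello, world\n");
```

An NPH script could not use this sketch as is, since it would have to emit the full HTTP status line and headers itself, terminated with \015\012.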
  
-    CGI Spec
-        http://www.w3.org/CGI/
+=head2 My CGI script runs from the command line but not the browser.  (500 Server Error)
  
-    CGI Security FAQ
-        http://www.go2net.com/people/paulp/cgi-security/safe-cgi.txt
+Several things could be wrong.  You can go through the "Troubleshooting
+Perl CGI scripts" guide at
+
+       http://www.perl.org/troubleshooting_CGI.html
+
+If, after that, you can demonstrate that you've read the FAQs and that
+your problem isn't something simple that can be easily answered, you'll
+probably receive a courteous and useful reply to your question if you
+post it on comp.infosystems.www.authoring.cgi (if it's something to do
+with HTTP or the CGI protocols).  Questions that appear to be Perl
+questions but are really CGI ones that are posted to comp.lang.perl.misc
+are not so well received.
+
+The useful FAQs, related documents, and troubleshooting guides are
+listed in the CGI Meta FAQ:
+
+       http://www.perl.org/CGI_MetaFAQ.html
  
  =head2 How can I get better error messages from a CGI program?
  
@@ -48,25 +72,25 @@ normal Carp modules C<carp>, C<croak>, and C<confess> functions with
  more verbose and safer versions.  It still sends them to the normal
  server error log.
  
-    use CGI::Carp;
-    warn "This is a complaint";
-    die "But this one is serious";
+       use CGI::Carp;
+       warn "This is a complaint";
+       die "But this one is serious";
  
  The following use of CGI::Carp also redirects errors to a file of your choice,
  placed in a BEGIN block to catch compile-time warnings as well:
  
-    BEGIN {
-        use CGI::Carp qw(carpout);
-        open(LOG, ">>/var/local/cgi-logs/mycgi-log")
-            or die "Unable to append to mycgi-log: $!\n";
-        carpout(*LOG);
-    }
+       BEGIN {
+               use CGI::Carp qw(carpout);
+               open(LOG, ">>/var/local/cgi-logs/mycgi-log")
+                       or die "Unable to append to mycgi-log: $!\n";
+               carpout(*LOG);
+       }
  
  You can even arrange for fatal errors to go back to the client browser,
  which is nice for your own debugging, but might confuse the end user.
  
-    use CGI::Carp qw(fatalsToBrowser);
-    die "Bad error here";
+       use CGI::Carp qw(fatalsToBrowser);
+       die "Bad error here";
  
  Even if the error happens before you get the HTTP header out, the module
  will try to take care of this to avoid the dreaded server 500 errors.
@@ -77,223 +101,269 @@ stamp prepended.
  =head2 How do I remove HTML from a string?
  
  The most correct way (albeit not the fastest) is to use HTML::Parser
-from CPAN (part of the HTML-Tree package on CPAN).  Another correct
+from CPAN.  Another mostly correct
  way is to use HTML::FormatText which not only removes HTML but also
  attempts to do a little simple formatting of the resulting plain text.
  
  Many folks attempt a simple-minded regular expression approach, like
-C<s/E<lt>.*?E<gt>//g>, but that fails in many cases because the tags
+C<< s/<.*?>//g >>, but that fails in many cases because the tags
  may continue over line breaks, they may contain quoted angle-brackets,
-or HTML comment may be present.  Plus folks forget to convert
-entities, like C<&lt;> for example.
+or HTML comments may be present.  Plus, folks forget to convert
+entities--like C<&lt;> for example.
  
  Here's one "simple-minded" approach, that works for most files:
  
-    #!/usr/bin/perl -p0777
-    s/<(?:[^>'"]*|(['"]).*?\1)*>//gs
+       #!/usr/bin/perl -p0777
+       s/<(?:[^>'"]*|(['"]).*?\1)*>//gs
  
  If you want a more complete solution, see the 3-stage striphtml
  program in
-http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/striphtml.gz
+http://www.cpan.org/authors/Tom_Christiansen/scripts/striphtml.gz
  .
  
  Here are some tricky cases that you should think about when picking
  a solution:
  
-    <IMG SRC = "foo.gif" ALT = "A > B">
+       <IMG SRC = "foo.gif" ALT = "A > B">
  
-    <IMG SRC = "foo.gif"
+       <IMG SRC = "foo.gif"
          ALT = "A > B">
  
-    <!-- <A comment> -->
+       <!-- <A comment> -->
  
-    <script>if (a<b && a>c)</script>
+       <script>if (a<b && a>c)</script>
  
-    <# Just data #>
+       <# Just data #>
  
-    <![INCLUDE CDATA [ >>>>>>>>>>>> ]]>
+       <![INCLUDE CDATA [ >>>>>>>>>>>> ]]>
  
  If HTML comments include other tags, those solutions would also break
  on text like this:
  
-    <!-- This section commented out.
-        <B>You can't see me!</B>
-    -->
+       <!-- This section commented out.
+               <B>You can't see me!</B>
+       -->
  
  =head2 How do I extract URLs?
  
-A quick but imperfect approach is
+You can easily extract all sorts of URLs from HTML with
+C<HTML::SimpleLinkExtor> which handles anchors, images, objects,
+frames, and many other tags that can contain a URL.  If you need
+anything more complex, you can create your own subclass of
+C<HTML::LinkExtor> or C<HTML::Parser>.  You might even use
+C<HTML::SimpleLinkExtor> as an example for something specifically
+suited to your needs.
+
+You can use URI::Find to extract URLs from an arbitrary text document.
+
+Less complete solutions involving regular expressions can save
+you a lot of processing time if you know that the input is simple.  One
+solution from Tom Christiansen runs 100 times faster than most
+module based approaches but only extracts URLs from anchors where the first
+attribute is HREF and there are no other attributes.
+
+       #!/usr/bin/perl -n00
+       # qxurl - tchrist@perl.com
+       print "$2\n" while m{
+               < \s*
+                 A \s+ HREF \s* = \s* (["']) (.*?) \1
+               \s* >
+       }gsix;
  
-    #!/usr/bin/perl -n00
-    # qxurl - tchrist@perl.com
-    print "$2\n" while m{
-       < \s*
-         A \s+ HREF \s* = \s* (["']) (.*?) \1
-       \s* >
-    }gsix;
+=head2 How do I download a file from the user's machine?  How do I open a file on another machine?
  
-This version does not adjust relative URLs, understand alternate
-bases, deal with HTML comments, deal with HREF and NAME attributes
-in the same tag, understand extra qualifiers like TARGET, or accept
-URLs themselves as arguments.  It also runs about 100x faster than a
-more "complete" solution using the LWP suite of modules, such as the
-http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/xurl.gz program.
+In this case, download means to use the file upload feature of HTML
+forms.  You allow the web surfer to specify a file to send to your web
+server.  To you it looks like a download, and to the user it looks
+like an upload.  No matter what you call it, you do it with what's
+known as B<multipart/form-data> encoding.  The CGI.pm module (which
+comes with Perl as part of the Standard Library) supports this in the
+start_multipart_form() method, which isn't the same as the startform()
+method.
  
-=head2 How do I download a file from the user's machine?  How do I open a file on another machine?
+See the section in the CGI.pm documentation on file uploads for code
+examples and details.
+
+=head2 How do I make an HTML pop-up menu with Perl?
+
+(contributed by brian d foy)
  
-In the context of an HTML form, you can use what's known as
-B<multipart/form-data> encoding.  The CGI.pm module (available from
-CPAN) supports this in the start_multipart_form() method, which isn't
-the same as the startform() method.
+The CGI.pm module (which comes with Perl) has functions to create
+the HTML form widgets. See the CGI.pm documentation for more
+examples.
  
-=head2 How do I make a pop-up menu in HTML?
+       use CGI qw/:standard/;
+       print header,
+               start_html('Favorite Animals'),
  
-Use the B<E<lt>SELECTE<gt>> and B<E<lt>OPTIONE<gt>> tags.  The CGI.pm
-module (available from CPAN) supports this widget, as well as many
-others, including some that it cleverly synthesizes on its own.
+               start_form,
+                       "What's your favorite animal? ",
+               popup_menu(
+                       -name   => 'animal',
+                       -values => [ qw( Llama Alpaca Camel Ram ) ]
+                       ),
+               submit,
+
+               end_form,
+               end_html;
  
  =head2 How do I fetch an HTML file?
  
-One approach, if you have the lynx text-based HTML browser installed
-on your system, is this:
+(contributed by brian d foy)
+
+Use the libwww-perl distribution. The C<LWP::Simple> module can fetch web
+resources and give their content back to you as a string:
+
+       use LWP::Simple qw(get);
  
-    $html_code = `lynx -source $url`;
-    $text_data = `lynx -dump $url`;
+       my $html = get( "http://www.example.com/index.html" );
  
-The libwww-perl (LWP) modules from CPAN provide a more powerful way
-to do this.  They don't require lynx, but like lynx, can still work
-through proxies:
+It can also store the resource directly in a file:
  
-    # simplest version
-    use LWP::Simple;
-    $content = get($URL);
+       use LWP::Simple qw(getstore);
  
-    # or print HTML from a URL
-    use LWP::Simple;
-    getprint "http://www.sn.no/libwww-perl/";
+       getstore( "http://www.example.com/index.html", "foo.html" );
  
-    # or print ASCII from HTML from a URL
-    # also need HTML-Tree package from CPAN
-    use LWP::Simple;
-    use HTML::Parser;
-    use HTML::FormatText;
-    my ($html, $ascii);
-    $html = get("http://www.perl.com/");
-    defined $html
-        or die "Can't fetch HTML from http://www.perl.com/";
-    $ascii = HTML::FormatText->new->format(parse_html($html));
-    print $ascii;
+If you need to do something more complicated, you can use the
+C<LWP::UserAgent> module to create your own user-agent (e.g. browser)
+to get the job done. If you want to simulate an interactive web
+browser, you can use the C<WWW::Mechanize> module.
  
  =head2 How do I automate an HTML form submission?
  
+If you are doing something complex, such as moving through many pages
+and forms on a web site, you can use C<WWW::Mechanize>.  See its
+documentation for all the details.
+
  If you're submitting values using the GET method, create a URL and encode
  the form using the C<query_form> method:
  
-    use LWP::Simple;
-    use URI::URL;
+       use LWP::Simple;
+       use URI::URL;
  
-    my $url = url('http://www.perl.com/cgi-bin/cpan_mod');
-    $url->query_form(module => 'DB_File', readme => 1);
-    $content = get($url);
+       my $url = url('http://www.perl.com/cgi-bin/cpan_mod');
+       $url->query_form(module => 'DB_File', readme => 1);
+       $content = get($url);
  
  If you're using the POST method, create your own user agent and encode
  the content appropriately.
  
-    use HTTP::Request::Common qw(POST);
-    use LWP::UserAgent;
+       use HTTP::Request::Common qw(POST);
+       use LWP::UserAgent;
  
-    $ua = LWP::UserAgent->new();
-    my $req = POST 'http://www.perl.com/cgi-bin/cpan_mod',
-                   [ module => 'DB_File', readme => 1 ];
-    $content = $ua->request($req)->as_string;
+       $ua = LWP::UserAgent->new();
+       my $req = POST 'http://www.perl.com/cgi-bin/cpan_mod',
+                                  [ module => 'DB_File', readme => 1 ];
+       $content = $ua->request($req)->as_string;
  
  =head2 How do I decode or create those %-encodings on the web?
+X<URI> X<CGI.pm> X<CGI> X<URI::Escape> X<RFC 2396>
+
+(contributed by brian d foy)
+
+Those C<%> encodings handle reserved characters in URIs, as described
+in RFC 2396, Section 2. This encoding replaces the reserved character
+with the hexadecimal representation of the character's number from
+the US-ASCII table. For instance, a colon, C<:>, becomes C<%3A>.
+
+In CGI scripts, you don't have to worry about decoding URIs if you are
+using C<CGI.pm>. You shouldn't have to process the URI yourself,
+either on the way in or the way out.
+
+If you have to encode a string yourself, remember that you should
+never try to encode an already-composed URI. You need to escape the
+components separately then put them together. To encode a string, you
+can use the C<URI::Escape> module. The C<uri_escape> function
+returns the escaped string:
+
+       use URI::Escape;
+
+       my $original = "Colon : Hash # Percent %";
  
-Here's an example of decoding:
+       my $escaped = uri_escape( $original );
  
-    $string = "http://altavista.digital.com/cgi-bin/query?pg=q&what=news&fmt=.&q=%2Bcgi-bin+%2Bperl.exe";
-    $string =~ s/%([a-fA-F0-9]{2})/chr(hex($1))/ge;
+       print "$escaped\n"; # 'Colon%20%3A%20Hash%20%23%20Percent%20%25'
  
-Encoding is a bit harder, because you can't just blindly change
-all the non-alphanumunder character (C<\W>) into their hex escapes.
-It's important that characters with special meaning like C</> and C<?>
-I<not> be translated.  Probably the easiest way to get this right is
-to avoid reinventing the wheel and just use the URI::Escape module,
-which is part of the libwww-perl package (LWP) available from CPAN.
+To decode the string, use the C<uri_unescape> function:
+
+       my $unescaped = uri_unescape( $escaped );
+
+       print $unescaped; # back to original
+
+If you wanted to do it yourself, you simply need to replace the
+reserved characters with their encodings. A global substitution
+is one way to do it:
+
+       # encode
+       $string =~ s/([^A-Za-z0-9\-_.!~*'()])/ sprintf "%%%02X", ord $1 /eg;
+
+       #decode
+       $string =~ s/%([A-Fa-f\d]{2})/chr hex $1/eg;
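A minimal round-trip sketch of the do-it-yourself approach, using only core Perl. The helper names percent_encode and percent_decode are hypothetical, and the sketch assumes the input is plain ASCII (wider characters would first need to be encoded to bytes). It escapes everything outside the RFC 2396 unreserved set and always emits two hex digits.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.
sub percent_encode {
    my ($s) = @_;
    # Escape every character that is not in the unreserved set.
    $s =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf "%%%02X", ord $1/eg;
    return $s;
}

sub percent_decode {
    my ($s) = @_;
    # Turn each %XX sequence back into the character it stands for.
    $s =~ s/%([A-Fa-f0-9]{2})/chr hex $1/eg;
    return $s;
}

my $original = "Colon : Hash # Percent %";
my $escaped  = percent_encode($original);

print "$escaped\n";                     # Colon%20%3A%20Hash%20%23%20Percent%20%25
print percent_decode($escaped), "\n";   # back to the original
```

As the text says, this should be applied to individual URI components, never to an already-composed URI.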
  
  =head2 How do I redirect to another page?
  
-Instead of sending back a C<Content-Type> as the headers of your
-reply, send back a C<Location:> header.  Officially this should be a
-C<URI:> header, so the CGI.pm module (available from CPAN) sends back
-both:
+Specify the complete URL of the destination (even if it is on the same
+server). This is one of the two different kinds of CGI "Location:"
+responses which are defined in the CGI specification for a Parsed Headers
+script. The other kind (an absolute URLpath) is resolved internally to
+the server without any HTTP redirection. The CGI specifications do not
+allow relative URLs in either case.
  
-    Location: http://www.domain.com/newpage
-    URI: http://www.domain.com/newpage
+Use of CGI.pm is strongly recommended.  This example shows redirection
+with a complete URL. This redirection is handled by the web browser.
  
-Note that relative URLs in these headers can cause strange effects
-because of "optimizations" that servers do.
+       use CGI qw/:standard/;
  
-    $url = "http://www.perl.com/CPAN/";
-    print "Location: $url\n\n";
-    exit;
+       my $url = 'http://www.cpan.org/';
+       print redirect($url);
  
-To target a particular frame in a frameset, include the "Window-target:"
-in the header.
+This example shows a redirection with an absolute URLpath.  This
+redirection is handled by the local web server.
  
-    print <<EOF;
-    Location: http://www.domain.com/newpage
-    Window-target: <FrameName>
+       my $url = '/CPAN/index.html';
+       print redirect($url);
  
-    EOF
+But if coded directly, it could be as follows (the final "\n" is
+shown separately, for clarity), using either a complete URL or
+an absolute URLpath.
  
-To be correct to the spec, each of those virtual newlines should really be
-physical C<"\015\012"> sequences by the time you hit the client browser.
-Except for NPH scripts, though, that local newline should get translated
-by your server into standard form, so you shouldn't have a problem
-here, even if you are stuck on MacOS.  Everybody else probably won't
-even notice.
+       print "Location: $url\n";   # CGI response header
+       print "\n";                 # end of headers
  
  =head2 How do I put a password on my web pages?
  
-That depends.  You'll need to read the documentation for your web
-server, or perhaps check some of the other FAQs referenced above.
+To enable authentication for your web server, you need to configure
+your web server.  The configuration is different for different sorts
+of web servers--Apache does it differently from iPlanet which does
+it differently from IIS.  Check your web server documentation for
+the details for your particular server.
  
  =head2 How do I edit my .htpasswd and .htgroup files with Perl?
  
  The HTTPD::UserAdmin and HTTPD::GroupAdmin modules provide a
  consistent OO interface to these files, regardless of how they're
-stored.  Databases may be text, dbm, Berkley DB or any database with a
-DBI compatible driver.  HTTPD::UserAdmin supports files used by the
-`Basic' and `Digest' authentication schemes.  Here's an example:
+stored.  Databases may be text, dbm, Berkeley DB or any database with
+a DBI compatible driver.  HTTPD::UserAdmin supports files used by the
+"Basic" and "Digest" authentication schemes.  Here's an example:
  
-    use HTTPD::UserAdmin ();
-    HTTPD::UserAdmin
+       use HTTPD::UserAdmin ();
+       HTTPD::UserAdmin
           ->new(DB => "/foo/.htpasswd")
           ->add($username => $password);
  
  =head2 How do I make sure users can't enter values into a form that cause my CGI script to do bad things?
  
-Read the CGI security FAQ, at
-http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html, and the
-Perl/CGI FAQ at
-http://www.perl.com/CPAN/doc/FAQs/cgi/perl-cgi-faq.html.
+See the security references listed in the CGI Meta FAQ:
  
-In brief: use tainting (see L<perlsec>), which makes sure that data
-from outside your script (eg, CGI parameters) are never used in
-C<eval> or C<system> calls.  In addition to tainting, never use the
-single-argument form of system() or exec().  Instead, supply the
-command and arguments as a list, which prevents shell globbing.
+       http://www.perl.org/CGI_MetaFAQ.html
  
  =head2 How do I parse a mail header?
  
  For a quick-and-dirty solution, try this solution derived
-from page 222 of the 2nd edition of "Programming Perl":
+from L<perlfunc/split>:
  
-    $/ = '';
-    $header = <MSG>;
-    $header =~ s/\n\s+/ /g;     # merge continuation lines
-    %head = ( UNIX_FROM_LINE, split /^([-\w]+):\s*/m, $header );
+       $/ = '';
+       $header = <MSG>;
+       $header =~ s/\n\s+/ /g;  # merge continuation lines
+       %head = ( UNIX_FROM_LINE, split /^([-\w]+):\s*/m, $header );
  
  That solution doesn't do well if, for example, you're trying to
  maintain all the Received lines.  A more complete approach is to use
@@ -301,107 +371,126 @@ the Mail::Header module from CPAN (part of the MailTools package).
  
  =head2 How do I decode a CGI form?
  
-You use a standard module, probably CGI.pm.  Under no circumstances
-should you attempt to do so by hand!
-
-You'll see a lot of CGI programs that blindly read from STDIN the number
-of bytes equal to CONTENT_LENGTH for POSTs, or grab QUERY_STRING for
-decoding GETs.  These programs are very poorly written.  They only work
-sometimes.  They typically forget to check the return value of the read()
-system call, which is a cardinal sin.  They don't handle HEAD requests.
-They don't handle multipart forms used for file uploads.  They don't deal
-with GET/POST combinations where query fields are in more than one place.
-They don't deal with keywords in the query string.
-
-In short, they're bad hacks.  Resist them at all costs.  Please do not be
-tempted to reinvent the wheel.  Instead, use the CGI.pm or CGI_Lite.pm
-(available from CPAN), or if you're trapped in the module-free land
-of perl1 .. perl4, you might look into cgi-lib.pl (available from
-http://cgi-lib.stanford.edu/cgi-lib/ ).
-
-Make sure you know whether to use a GET or a POST in your form.
-GETs should only be used for something that doesn't update the server.
-Otherwise you can get mangled databases and repeated feedback mail
-messages.  The fancy word for this is ``idempotency''.  This simply
-means that there should be no difference between making a GET request
-for a particular URL once or multiple times.  This is because the
-HTTP protocol definition says that a GET request may be cached by the
-browser, or server, or an intervening proxy.  POST requests cannot be
-cached, because each request is independent and matters.  Typically,
-POST requests change or depend on state on the server (query or update
-a database, send mail, or purchase a computer).
+(contributed by brian d foy)
+
+Use the CGI.pm module that comes with Perl.  It's quick,
+it's easy, and it actually does quite a bit of work to
+ensure things happen correctly.  It handles GET, POST, and
+HEAD requests, multipart forms, multivalued fields, query
+string and message body combinations, and many other things
+you probably don't want to think about.
+
+It doesn't get much easier: the CGI module automatically
+parses the input and makes each value available through the
+C<param()> function.
+
+       use CGI qw(:standard);
+
+       my $total = param( 'price' ) + param( 'shipping' );
+
+       my @items = param( 'item' ); # multiple values, same field name
+
+If you want an object-oriented approach, CGI.pm can do that too.
+
+       use CGI;
+
+       my $cgi = CGI->new();
+
+       my $total = $cgi->param( 'price' ) + $cgi->param( 'shipping' );
+
+       my @items = $cgi->param( 'item' );
+
+You might also try CGI::Minimal which is a lightweight version
+of the same thing.  Other CGI::* modules on CPAN might work better
+for you, too.
+
+Many people try to write their own decoder (or copy one from
+another program) and then run into one of the many "gotchas"
+of the task.  It's much easier and less hassle to use CGI.pm.
  
  =head2 How do I check a valid mail address?
  
-You can't, at least, not in real time.  Bummer, eh?
+(partly contributed by Aaron Sherman)
  
-Without sending mail to the address and seeing whether there's a human
-on the other hand to answer you, you cannot determine whether a mail
-address is valid.  Even if you apply the mail header standard, you
-can have problems, because there are deliverable addresses that aren't
-RFC-822 (the mail header standard) compliant, and addresses that aren't
-deliverable which are compliant.
-
-Many are tempted to try to eliminate many frequently-invalid
-mail addresses with a simple regex, such as
-C</^[\w.-]+\@([\w.-]\.)+\w+$/>.  It's a very bad idea.  However,
-this also throws out many valid ones, and says nothing about
-potential deliverability, so is not suggested.  Instead, see
-http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/ckaddr.gz ,
-which actually checks against the full RFC spec (except for nested
-comments), looks for addresses you may not wish to accept mail to
-(say, Bill Clinton or your postmaster), and then makes sure that the
-hostname given can be looked up in the DNS MX records.  It's not fast,
-but it works for what it tries to do.
+This isn't as simple a question as it sounds.  There are two parts:
  
-Our best advice for verifying a person's mail address is to have them
-enter their address twice, just as you normally do to change a password.
-This usually weeds out typos.  If both versions match, send
-mail to that address with a personal message that looks somewhat like:
+a) How do I verify that an email address is correctly formatted?
  
-    Dear someuser@host.com,
+b) How do I verify that an email address targets a valid recipient?
  
-    Please confirm the mail address you gave us Wed May  6 09:38:41
-    MDT 1998 by replying to this message.  Include the string
-    "Rumpelstiltskin" in that reply, but spelled in reverse; that is,
-    start with "Nik...".  Once this is done, your confirmed address will
-    be entered into our records.
+Without sending mail to the address and seeing whether there's a human
+on the other end to answer you, you cannot fully answer part I<b>, but
+either the C<Email::Valid> or the C<RFC::RFC822::Address> module will do
+both part I<a> and part I<b> as far as you can in real-time.
+
+If you want to just check part I<a> to see that the address is valid
+according to the mail header standard with a simple regular expression,
+you can have problems, because there are deliverable addresses that
+aren't RFC-2822 (the latest mail header standard) compliant, and
+addresses that aren't deliverable which are compliant.  However, the
+following will match valid RFC-2822 addresses that do not have comments,
+folding whitespace, or any other obsolete or non-essential elements.
+This I<just> matches the address itself:
+
+       my $atom       = qr{[a-zA-Z0-9_!#\$\%&'*+/=?\^`{}~|\-]+};
+       my $dot_atom   = qr{$atom(?:\.$atom)*};
+       my $quoted     = qr{"(?:\\[^\r\n]|[^\\"])*"};
+       my $local      = qr{(?:$dot_atom|$quoted)};
+       my $quotedpair = qr{\\[\x00-\x09\x0B-\x0c\x0e-\x7e]};
+       my $domain_lit = qr{\[(?:$quotedpair|[\x21-\x5a\x5e-\x7e])*\]};
+       my $domain     = qr{(?:$dot_atom|$domain_lit)};
+       my $addr_spec  = qr{$local\@$domain};
+
+Just match an address against C</^${addr_spec}$/> to see if it follows
+the RFC2822 specification.  However, because it is impossible to be
+sure that such a correctly formed address is actually the correct way
+to reach a particular person or even has a mailbox associated with it,
+you must be very careful about how you use this.
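A short usage sketch for the pattern above. The regex definitions are repeated so the snippet runs on its own; the wrapper name is_wellformed_address is hypothetical. It only answers part a (format), and says nothing about whether the address is deliverable.

```perl
use strict;
use warnings;

my $atom       = qr{[a-zA-Z0-9_!#\$\%&'*+/=?\^`{}~|\-]+};
my $dot_atom   = qr{$atom(?:\.$atom)*};
my $quoted     = qr{"(?:\\[^\r\n]|[^\\"])*"};
my $local      = qr{(?:$dot_atom|$quoted)};
my $quotedpair = qr{\\[\x00-\x09\x0B-\x0c\x0e-\x7e]};
my $domain_lit = qr{\[(?:$quotedpair|[\x21-\x5a\x5e-\x7e])*\]};
my $domain     = qr{(?:$dot_atom|$domain_lit)};
my $addr_spec  = qr{$local\@$domain};

# Hypothetical wrapper: true if the string looks like a bare addr-spec.
sub is_wellformed_address {
    my ($candidate) = @_;
    return $candidate =~ /^${addr_spec}$/ ? 1 : 0;
}

for my $candidate ('user@example.com', 'a..b@example.com', 'no at sign') {
    printf "%-20s %s\n", $candidate,
        is_wellformed_address($candidate) ? 'well-formed' : 'rejected';
}
```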
  
-If you get the message back and they've followed your directions,
-you can be reasonably assured that it's real.
+Our best advice for verifying a person's mail address is to have them
+enter their address twice, just as you normally do to change a
+password. This usually weeds out typos. If both versions match, send
+mail to that address with a personal message. If you get the message
+back and they've followed your directions, you can be reasonably
+assured that it's real.
  
  A related strategy that's less open to forgery is to give them a PIN
  (personal ID number).  Record the address and PIN (best that it be a
-random one) for later processing.  In the mail you send, ask them to
+random one) for later processing. In the mail you send, ask them to
  include the PIN in their reply.  But if it bounces, or the message is
-included via a ``vacation'' script, it'll be there anyway.  So it's
+included via a "vacation" script, it'll be there anyway.  So it's
  best to ask them to mail back a slight alteration of the PIN, such as
  with the characters reversed, one added or subtracted to each digit, etc.
  
  =head2 How do I decode a MIME/BASE64 string?
  
-The MIME-tools package (available from CPAN) handles this and a lot
-more.  Decoding BASE64 becomes as simple as:
+The MIME-Base64 package (available from CPAN) handles this as well as
+the MIME/QP encoding.  Decoding BASE64 becomes as simple as:
  
-    use MIME::base64;
-    $decoded = decode_base64($encoded);
+       use MIME::Base64;
+       $decoded = decode_base64($encoded);
  
-A more direct approach is to use the unpack() function's "u"
+The MIME-Tools package (available from CPAN) supports extraction with
+decoding of BASE64 encoded attachments and content directly from email
+messages.
+
+If the string to decode is short (less than 84 bytes long)
+a more direct approach is to use the unpack() function's "u"
  format after minor transliterations:
  
-    tr#A-Za-z0-9+/##cd;                   # remove non-base64 chars
-    tr#A-Za-z0-9+/# -_#;                  # convert to uuencoded format
-    $len = pack("c", 32 + 0.75*length);   # compute length byte
-    print unpack("u", $len . $_);         # uudecode and print
+       tr#A-Za-z0-9+/##cd;                   # remove non-base64 chars
+       tr#A-Za-z0-9+/# -_#;                  # convert to uuencoded format
+       $len = pack("c", 32 + 0.75*length);   # compute length byte
+       print unpack("u", $len . $_);         # uudecode and print
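For comparison, here is a small sketch that decodes the same short string both ways. MIME::Base64 ships with recent Perls; the function name decode_base64_unpack is hypothetical and simply wraps the unpack() steps above so they act on an argument instead of $_. The sample input is chosen so that its base64 form has no "=" padding.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical wrapper around the unpack("u") technique shown above.
# Only suitable for short strings, as noted in the text.
sub decode_base64_unpack {
    my ($str) = @_;
    $str =~ tr#A-Za-z0-9+/##cd;                       # remove non-base64 chars
    $str =~ tr#A-Za-z0-9+/# -_#;                      # convert to uuencoded format
    my $len = pack("c", 32 + 0.75 * length $str);     # compute length byte
    return unpack("u", $len . $str);                  # uudecode
}

my $encoded = encode_base64("Hello, Perl!", "");      # "" suppresses the newline

print decode_base64($encoded), "\n";
print decode_base64_unpack($encoded), "\n";
```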
  
  =head2 How do I return the user's mail address?
  
-On systems that support getpwuid, the $E<lt> variable and the
+On systems that support getpwuid, the $< variable, and the
  Sys::Hostname module (which is part of the standard perl distribution),
  you can probably try using something like this:
  
-    use Sys::Hostname;
-    $address = sprintf('%s@%s', scalar getpwuid($<), hostname);
+       use Sys::Hostname;
+       $address = sprintf('%s@%s', scalar getpwuid($<), hostname);
  
  Company policies on mail address can mean that this generates addresses
  that the company's mail system will not accept, so you should ask for
@@ -418,17 +507,17 @@ Again, the best way is often just to ask the user.
  
  Use the C<sendmail> program directly:
  
-    open(SENDMAIL, "|/usr/lib/sendmail -oi -t -odq")
-                        or die "Can't fork for sendmail: $!\n";
-    print SENDMAIL <<"EOF";
-    From: User Originating Mail <me\@host>
-    To: Final Destination <you\@otherhost>
-    Subject: A relevant subject line
+       open(SENDMAIL, "|/usr/lib/sendmail -oi -t -odq")
+               or die "Can't fork for sendmail: $!\n";
+       print SENDMAIL <<"EOF";
+       From: User Originating Mail <me\@host>
+       To: Final Destination <you\@otherhost>
+       Subject: A relevant subject line
  
-    Body of the message goes here after the blank line
-    in as many lines as you like.
-    EOF
-    close(SENDMAIL)     or warn "sendmail didn't close nicely";
+       Body of the message goes here after the blank line
+       in as many lines as you like.
+       EOF
+       close(SENDMAIL)     or warn "sendmail didn't close nicely";
  
  The B<-oi> option prevents sendmail from interpreting a line consisting
  of a single dot as "end of message".  The B<-t> option says to use the
@@ -444,85 +533,127 @@ probably sendmail.
  
  Or you might be able use the CPAN module Mail::Mailer:
  
-    use Mail::Mailer;
+       use Mail::Mailer;
  
-    $mailer = Mail::Mailer->new();
-    $mailer->open({ From    => $from_address,
-                    To      => $to_address,
-                    Subject => $subject,
-                  })
-        or die "Can't open: $!\n";
-    print $mailer $body;
-    $mailer->close();
+       $mailer = Mail::Mailer->new();
+       $mailer->open({ From    => $from_address,
+                                       To      => $to_address,
+                                       Subject => $subject,
+                                 })
+               or die "Can't open: $!\n";
+       print $mailer $body;
+       $mailer->close();
  
  The Mail::Internet module uses Net::SMTP which is less Unix-centric than
  Mail::Mailer, but less reliable.  Avoid raw SMTP commands.  There
  are many reasons to use a mail transport agent like sendmail.  These
-include queueing, MX records, and security.
+include queuing, MX records, and security.
+
+=head2 How do I use MIME to make an attachment to a mail message?
+
+This answer is extracted directly from the MIME::Lite documentation.
+Create a multipart message (i.e., one with attachments).
+
+       use MIME::Lite;
+
+       ### Create a new multipart message:
+       $msg = MIME::Lite->new(
+                                From    =>'me@myhost.com',
+                                To      =>'you@yourhost.com',
+                                Cc      =>'some@other.com, some@more.com',
+                                Subject =>'A message with 2 parts...',
+                                Type    =>'multipart/mixed'
+                                );
+
+       ### Add parts (each "attach" has same arguments as "new"):
+       $msg->attach(Type     =>'TEXT',
+                                Data     =>"Here's the GIF file you wanted"
+                                );
+       $msg->attach(Type     =>'image/gif',
+                                Path     =>'aaa000123.gif',
+                                Filename =>'logo.gif'
+                                );
+
+       $text = $msg->as_string;
+
+MIME::Lite also includes a method for sending these things.
+
+       $msg->send;
+
+This defaults to using C<sendmail> but can be customized to use
+SMTP via L<Net::SMTP>.
  
  =head2 How do I read mail?
  
  While you could use the Mail::Folder module from CPAN (part of the
-MailFolder package) or the Mail::Internet module from CPAN (also part
-of the MailTools package), often a module is overkill, though.  Here's a
+MailFolder package) or the Mail::Internet module from CPAN (part
+of the MailTools package), often a module is overkill.  Here's a
  mail sorter.
  
-    #!/usr/bin/perl
-    # bysub1 - simple sort by subject
-    my(@msgs, @sub);
-    my $msgno = -1;
-    $/ = '';                    # paragraph reads
-    while (<>) {
-        if (/^From/m) {
-            /^Subject:\s*(?:Re:\s*)*(.*)/mi;
-            $sub[++$msgno] = lc($1) || '';
-        }
-        $msgs[$msgno] .= $_;
-    }
-    for my $i (sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msgs)) {
-        print $msgs[$i];
-    }
+       #!/usr/bin/perl
+
+       my(@msgs, @sub);
+       my $msgno = -1;
+       $/ = '';                    # paragraph reads
+       while (<>) {
+               if (/^From /m) {
+                       /^Subject:\s*(?:Re:\s*)*(.*)/mi;
+                       $sub[++$msgno] = lc($1) || '';
+               }
+               $msgs[$msgno] .= $_;
+       }
+       for my $i (sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msgs)) {
+               print $msgs[$i];
+       }
  
  Or more succinctly,
  
-    #!/usr/bin/perl -n00
-    # bysub2 - awkish sort-by-subject
-    BEGIN { $msgno = -1 }
-    $sub[++$msgno] = (/^Subject:\s*(?:Re:\s*)*(.*)/mi)[0] if /^From/m;
-    $msg[$msgno] .= $_;
-    END { print @msg[ sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msg) ] }
+       #!/usr/bin/perl -n00
+       # bysub2 - awkish sort-by-subject
+       BEGIN { $msgno = -1 }
+       $sub[++$msgno] = (/^Subject:\s*(?:Re:\s*)*(.*)/mi)[0] if /^From /m;
+       $msg[$msgno] .= $_;
+       END { print @msg[ sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msg) ] }
+
+=head2 How do I find out my hostname, domainname, or IP address?
+X<hostname, domainname, IP address, host, domain, hostfqdn, inet_ntoa,
+gethostbyname, Socket, Net::Domain, Sys::Hostname>
+
+(contributed by brian d foy)
  
-=head2 How do I find out my hostname/domainname/IP address?
+The Net::Domain module, which is part of the standard distribution starting
+in perl5.7.3, can get you the fully qualified domain name (FQDN), the host
+name, or the domain name.
  
-The normal way to find your own hostname is to call the C<`hostname`>
-program.  While sometimes expedient, this has some problems, such as
-not knowing whether you've got the canonical name or not.  It's one of
-those tradeoffs of convenience versus portability.
+       use Net::Domain qw(hostname hostfqdn hostdomain);
  
-The Sys::Hostname module (part of the standard perl distribution) will
-give you the hostname after which you can find out the IP address
-(assuming you have working DNS) with a gethostbyname() call.
+       my $host = hostfqdn();
  
-    use Socket;
-    use Sys::Hostname;
-    my $host = hostname();
-    my $addr = inet_ntoa(scalar gethostbyname($host || 'localhost'));
+The C<Sys::Hostname> module, included in the standard distribution since
+perl5.6, can also get the hostname.
  
-Probably the simplest way to learn your DNS domain name is to grok
-it out of /etc/resolv.conf, at least under Unix.  Of course, this
-assumes several things about your resolv.conf configuration, including
-that it exists.
+       use Sys::Hostname;
  
-(We still need a good DNS domain name-learning method for non-Unix
-systems.)
+       $host = hostname();
+
+To get the IP address, you can use the C<gethostbyname> built-in function
+to turn the name into a number. To turn that number into the dotted octet
+form (a.b.c.d) that most people expect, use the C<inet_ntoa> function
+from the C<Socket> module, which also comes with perl.
+
+       use Socket;
+
+       my $address = inet_ntoa(
+               scalar gethostbyname( $host || 'localhost' )
+               );
  
  =head2 How do I fetch a news article or the active newsgroups?
  
  Use the Net::NNTP or News::NNTPClient modules, both available from CPAN.
-This can make tasks like fetching the newsgroup list as simple as:
+This can make tasks like fetching the newsgroup list as simple as:
  
-    perl -MNews::NNTPClient
-      -e 'print News::NNTPClient->new->list("newsgroups")'
+       perl -MNews::NNTPClient \
+         -e 'print News::NNTPClient->new->list("newsgroups")'
  
  =head2 How do I fetch/put an FTP file?
  
@@ -531,22 +662,26 @@ available from CPAN) is more complex but can put as well as fetch.
  
  =head2 How can I do RPC in Perl?
  
-A DCE::RPC module is being developed (but is not yet available), and
-will be released as part of the DCE-Perl package (available from
-CPAN).  The rpcgen suite, available from CPAN/authors/id/JAKE/, is
-an RPC stub generator and includes an RPC::ONC module.
+(Contributed by brian d foy)
+
+Use one of the RPC modules you can find on CPAN (
+http://search.cpan.org/search?query=RPC&mode=all ).
+
+=head1 REVISION
+
+Revision: $Revision$
+
+Date: $Date$
+
+See L<perlfaq> for source control details and availability.
  
  =head1 AUTHOR AND COPYRIGHT
  
-Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington.
-All rights reserved.
+Copyright (c) 1997-2009 Tom Christiansen, Nathan Torkington, and
+other authors as noted. All rights reserved.
  
-When included as part of the Standard Version of Perl, or as part of
-its complete documentation whether printed or otherwise, this work
-may be distributed only under the terms of Perl's Artistic License.
-Any distribution of this file or derivatives thereof I<outside>
-of that package require that special arrangements be made with
-copyright holder.
+This documentation is free; you can redistribute it and/or modify it
+under the same terms as Perl itself.
  
  Irrespective of its distribution, all code examples in this file
  are hereby placed into the public domain.  You are permitted and