This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
Don't patch perlopentut: rewrite it completely
[perl5.git] / pod / perlopentut.pod
=encoding utf8

=head1 NAME

perlopentut - simple recipes for opening files and pipes in Perl

=head1 DESCRIPTION

Whenever you do I/O on a file in Perl, you do so through what in Perl is
called a B<filehandle>. A filehandle is an internal name for an external
file. It is the job of the C<open> function to make the association
between the internal name and the external name, and it is the job
of the C<close> function to break that association.

For your convenience, Perl sets up a few special filehandles that are
already open when you run. These include C<STDIN>, C<STDOUT>, C<STDERR>,
and C<ARGV>. Since those are pre-opened, you can use them right away
without having to go to the trouble of opening them yourself:

    print STDERR "This is a debugging message.\n";

    print STDOUT "Please enter something: ";
    $response = <STDIN> // die "how come no input?";
    print STDOUT "Thank you!\n";

    while (<ARGV>) { ... }

As you see from those examples, C<STDOUT> and C<STDERR> are output
handles, and C<STDIN> and C<ARGV> are input handles. Those are
in all capital letters because they are reserved to Perl, much
like the C<@ARGV> array and the C<%ENV> hash are. Their external
associations were set up by your shell.

Everything else you will need to open on your own. Although there
are many other variants, the most common way to call Perl's open() function
is with three arguments and one return value:

C< I<OK> = open(I<HANDLE>, I<MODE>, I<PATHNAME>)>

Where:

=over

=item I<OK>

will be some defined value if the open succeeds, but
C<undef> if it fails;

=item I<HANDLE>

should be an undefined scalar variable to be filled in by the
C<open> function if it succeeds;

=item I<MODE>

is the access mode and the encoding format to open the file with;

=item I<PATHNAME>

is the external name of the file you want opened.

=back

Most of the complexity of the C<open> function lies in the many
possible values that the I<MODE> parameter can take on.

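To make the four pieces concrete, here is a minimal sketch that captures
the return value of C<open> instead of dying on it. The pathname is
hypothetical and was chosen only because it should not exist, so the
failure branch is the one you would expect to run.

```perl
use strict;
use warnings;

# Hypothetical pathname, assumed not to exist on your system.
my $pathname = "/no/such/directory/no-such-file.txt";

my $handle;    # HANDLE: an undefined scalar for open to fill in
my $ok = open($handle, "< :encoding(UTF-8)", $pathname);

if ($ok) {
    print "opened $pathname\n";
    close($handle) || die "$0: can't close $pathname: $!\n";
}
else {
    # On failure, open returns undef and $! holds the reason.
    print "open failed: $!\n";
}
```

In real programs you will more often chain the call to C<die>, as the
examples in the rest of this document do.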
One last thing before we show you how to open files: opening
files does not (usually) automatically lock them in Perl. See
L<perlfaq4> for how to lock.

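As a rough illustration of what explicit locking looks like, the sketch
below uses the built-in C<flock> with the constants exported by the core
C<Fcntl> module. It assumes a platform where C<flock> is available, and
it uses a temporary file from C<File::Temp> purely so that it can run
anywhere. Locks taken this way are advisory: they only affect other
programs that also call C<flock>.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);           # LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB
use File::Temp qw(tempfile);

# A scratch file so the sketch is self-contained.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "$0: can't close $filename: $!\n";

open(my $handle, ">> :encoding(UTF-8)", $filename)
    || die "$0: can't open $filename for appending: $!\n";

# Block until we hold an exclusive (writer's) lock.
flock($handle, LOCK_EX)
    || die "$0: can't lock $filename: $!\n";

print $handle "one line written while holding the lock\n";

# Closing the handle flushes the data and releases the lock.
close($handle) || die "$0: can't close $filename: $!\n";
```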
=head1 Opening Text Files

=head2 Opening Text Files for Reading

If you want to read from a text file, first open it in
read-only mode like this:

    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";
    my $handle   = undef;     # this will be filled in on success

    open($handle, "< $encoding", $filename)
        || die "$0: can't open $filename for reading: $!\n";

As with the shell, in Perl the C<< "<" >> is used to open the file in
read-only mode. If it succeeds, Perl allocates a brand new filehandle for
you and fills in your previously undefined C<$handle> argument with a
reference to that handle.

Now you may use functions like C<readline>, C<read>, C<getc>, and
C<sysread> on that handle. Probably the most common input function
is the one that looks like an operator:

    $line = readline($handle);
    $line = <$handle>;          # same thing

Because the C<readline> function returns C<undef> at end of file or
upon error, you will sometimes see it used this way:

    $line = <$handle>;
    if (defined $line) {
        # do something with $line
    }
    else {
        # $line is not valid, so skip it
    }

You can also just quickly C<die> on an undefined value this way:

    $line = <$handle> // die "no input found";

However, if hitting EOF is an expected and normal event, you
would not want to exit just because you ran out of input. Instead,
you probably just want to exit an input loop. Immediately
afterwards you can then test to see if there was an actual
error that caused the loop to terminate, and act accordingly:

    while (<$handle>) {
        # do something with data in $_
    }
    if ($!) {
        die "unexpected error while reading from $filename: $!";
    }

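Putting the pieces of this section together, here is a small sketch that
you can run as it stands. It assumes a scratch file created with the core
C<File::Temp> module and three made-up lines of input; neither comes from
any real program.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $encoding = ":encoding(UTF-8)";

# Create a scratch file holding three sample lines.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "$0: can't close $filename: $!\n";

open(my $out, "> $encoding", $filename)
    || die "$0: can't open $filename for writing: $!\n";
print $out "first\nsecond\nthird\n";
close($out) || die "$0: can't close $filename: $!\n";

# Read it back one line at a time.
open(my $handle, "< $encoding", $filename)
    || die "$0: can't open $filename for reading: $!\n";

my @lines;
while (my $line = <$handle>) {   # the loop ends when readline returns undef
    chomp $line;                 # strip the trailing newline
    push @lines, $line;
}
close($handle) || die "$0: can't close $filename: $!\n";

print scalar(@lines), " lines read\n";
```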
B<A Note on Encodings>: Having to specify the text encoding every time
might seem a bit of a bother. To set up a default encoding for C<open> so
that you don't have to supply it each time, you can use the C<open> pragma:

    use open qw< :encoding(UTF-8) >;

Once you've done that, you can safely omit the encoding part of the
open mode:

    open($handle, "<", $filename)
        || die "$0: can't open $filename for reading: $!\n";

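If you want to convince yourself that the pragma took effect, one way is
to ask Perl which I/O layers ended up on the handle. The sketch below
assumes the C<PerlIO::get_layers> function described in L<PerlIO> and a
scratch file from C<File::Temp>; the exact layer names it prints can
differ between platforms and Perl versions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use open qw< :encoding(UTF-8) >;   # default for opens compiled in this scope

# An empty scratch file is enough for this check.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "$0: can't close $filename: $!\n";

# No encoding in the mode: the pragma supplies it.
open(my $handle, "<", $filename)
    || die "$0: can't open $filename for reading: $!\n";

# List the layers that are now on the handle.
my @layers = PerlIO::get_layers($handle);
print "layers: @layers\n";

close($handle) || die "$0: can't close $filename: $!\n";
```

The pragma is lexically scoped, so it only affects C<open> calls written
in the same file or block.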
But never use the bare C<< "<" >> without having set up a default encoding
first. Otherwise, Perl cannot know which of the many, many, many possible
flavors of text file you have, and Perl will have no idea how to correctly
map the data in your file into actual characters it can work with. Other
common encoding formats include C<"ASCII">, C<"ISO-8859-1">,
C<"ISO-8859-15">, C<"Windows-1252">, C<"MacRoman">, and even C<"UTF-16LE">.
See L<perlunitut> for more about encodings.

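To see why the encoding matters, consider this sketch. It writes the
UTF-8 bytes for the word "café" into a scratch file and then reads the
same file back under two different encodings. The helper
C<first_line_length> is a made-up name used only for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "$0: can't close $filename: $!\n";

# Write raw bytes: "caf" followed by the two-byte UTF-8 form of e-acute.
open(my $out, "> :raw", $filename)
    || die "$0: can't open $filename for writing: $!\n";
print $out "caf\xC3\xA9\n";
close($out) || die "$0: can't close $filename: $!\n";

# Hypothetical helper: read the first line under the given encoding
# and return how many characters Perl thinks it contains.
sub first_line_length {
    my ($encoding) = @_;
    open(my $in, "< :encoding($encoding)", $filename)
        || die "$0: can't open $filename for reading: $!\n";
    my $line = <$in> // die "no input found";
    close($in) || die "$0: can't close $filename: $!\n";
    chomp $line;
    return length $line;
}

print "as UTF-8:      ", first_line_length("UTF-8"),      " characters\n";
print "as ISO-8859-1: ", first_line_length("ISO-8859-1"), " characters\n";
```

Read as UTF-8, the line is four characters long. Read as ISO-8859-1, the
two bytes of the accented letter become two separate characters, so the
line appears to be five characters long.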
=head2 Opening Text Files for Writing

When you want to write to a file, you first have to decide
what to do about any existing contents. You have two basic choices here:
to preserve or to clobber.

If you want to preserve any existing contents, then you want to open the
file in append mode. As in the shell, in Perl you use C<<< ">>" >>> to
open an existing file in append mode. C<<< ">>" >>> creates the file if
it does not already exist.

    my $handle   = undef;
    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";

    open($handle, ">> $encoding", $filename)
        || die "$0: can't open $filename for appending: $!\n";

Now you can write to that filehandle using any of C<print>, C<printf>,
C<say>, C<write>, or C<syswrite>.

The file does not have to exist just to open it in append mode. If the
file did not previously exist, then the append-mode open creates it for
you. But if the file does previously exist, its contents are safe from
harm because you will be adding your new text past the end of the old text.

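The following sketch shows that behavior in action. It opens the same
scratch file in append mode twice, writing one line each time, and then
reads the file back. The file name comes from C<File::Temp> and the two
lines of text are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $encoding = ":encoding(UTF-8)";

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "$0: can't close $filename: $!\n";

# Each pass opens the file afresh in append mode and adds one line.
for my $text ("first run\n", "second run\n") {
    open(my $handle, ">> $encoding", $filename)
        || die "$0: can't open $filename for appending: $!\n";
    print $handle $text;
    close($handle) || die "$0: can't close $filename: $!\n";
}

# Both lines are still there, in the order they were written.
open(my $in, "< $encoding", $filename)
    || die "$0: can't open $filename for reading: $!\n";
my @lines = <$in>;
close($in) || die "$0: can't close $filename: $!\n";

print @lines;
```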
On the other hand, sometimes you want to clobber whatever might already be
there. To empty out a file before you start writing to it, you can open it
in write-only mode:

    my $handle   = undef;
    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";

    open($handle, "> $encoding", $filename)
        || die "$0: can't open $filename in write-only mode: $!\n";

Here again Perl works just like the shell in that the C<< ">" >> clobbers
an existing file.

As with the append mode, when you open a file in write-only mode,
you can now write to that filehandle using any of C<print>, C<printf>,
C<say>, C<write>, or C<syswrite>.

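Here is a short sketch of the clobbering behavior. It writes three lines
to a scratch file, reopens the file in write-only mode to write a single
line, and then counts what is left. The scratch file and its contents are
assumptions made for the sake of a runnable example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $encoding = ":encoding(UTF-8)";

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "$0: can't close $filename: $!\n";

# First open: leave three lines in the file.
open(my $first, "> $encoding", $filename)
    || die "$0: can't open $filename in write-only mode: $!\n";
print $first "old line 1\nold line 2\nold line 3\n";
close($first) || die "$0: can't close $filename: $!\n";

# Second open: ">" empties the file before anything is written.
open(my $second, "> $encoding", $filename)
    || die "$0: can't open $filename in write-only mode: $!\n";
print $second "new line\n";
close($second) || die "$0: can't close $filename: $!\n";

# Only the new line survives.
open(my $in, "< $encoding", $filename)
    || die "$0: can't open $filename for reading: $!\n";
my @lines = <$in>;
close($in) || die "$0: can't close $filename: $!\n";

print "file now holds ", scalar(@lines), " line(s)\n";
```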
What about read-write mode? You should probably pretend it doesn't exist,
because opening text files in read-write mode is unlikely to do what you
would like. See L<perlfaq4> for details.

=head1 Opening Binary Files

If the file to be opened contains binary data instead of text characters,
then the I<MODE> argument to C<open> is a little different. Instead of
specifying the encoding, you tell Perl that your data are in raw bytes.

    my $filename = "/some/path/to/a/binary/file/goes/here";
    my $encoding = ":raw :bytes";
    my $handle   = undef;     # this will be filled in on success

And then open as before, choosing C<<< "<" >>>, C<<< ">>" >>>, or
C<<< ">" >>> as needed:

    open($handle, "< $encoding", $filename)
        || die "$0: can't open $filename for reading: $!\n";

    open($handle, ">> $encoding", $filename)
        || die "$0: can't open $filename for appending: $!\n";

    open($handle, "> $encoding", $filename)
        || die "$0: can't open $filename in write-only mode: $!\n";

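As an illustration, the sketch below writes three numbers to a scratch
file as packed bytes and reads them back. The C<pack> template and the
values are arbitrary choices for this example, and the file comes from
C<File::Temp>.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $encoding = ":raw :bytes";
my $template = "N n C";    # 32-bit and 16-bit big-endian, then one byte

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "$0: can't close $filename: $!\n";

# Write the packed bytes exactly as they are.
open(my $out, "> $encoding", $filename)
    || die "$0: can't open $filename in write-only mode: $!\n";
print $out pack($template, 1_000_000, 513, 255);
close($out) || die "$0: can't close $filename: $!\n";

# Read the same seven bytes back and unpack them.
open(my $in, "< $encoding", $filename)
    || die "$0: can't open $filename for reading: $!\n";
my $got = read($in, my $bytes, 7);
defined($got) || die "$0: read error on $filename: $!\n";
close($in) || die "$0: can't close $filename: $!\n";

my ($big, $medium, $small) = unpack($template, $bytes);
print "read back: $big $medium $small\n";
```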
Alternately, you can change to binary mode on an existing handle this way:

    binmode($handle)    || die "cannot binmode handle";

This is especially handy for the handles that Perl has already opened for you.

    binmode(STDIN)      || die "cannot binmode STDIN";
    binmode(STDOUT)     || die "cannot binmode STDOUT";

You can also pass C<binmode> an explicit encoding to change it on the fly.
This isn't exactly "binary" mode, but we still use C<binmode> to do it:

    binmode(STDIN,  ":encoding(MacRoman)") || die "cannot binmode STDIN";
    binmode(STDOUT, ":encoding(UTF-8)")    || die "cannot binmode STDOUT";

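The same technique works on handles you opened yourself. In this sketch,
a scratch file is opened in raw mode, switched to UTF-8 with C<binmode>,
and then sent a single non-ASCII character. The character chosen, U+263A,
is just an example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "$0: can't close $filename: $!\n";

# Start out with a raw, bytes-only handle.
open(my $handle, "> :raw", $filename)
    || die "$0: can't open $filename in write-only mode: $!\n";

# Now switch the existing handle over to UTF-8 output.
binmode($handle, ":encoding(UTF-8)") || die "cannot binmode handle";

# One character (U+263A) plus a newline.
print $handle "\x{263A}\n";
close($handle) || die "$0: can't close $filename: $!\n";

my $size = -s $filename;
print "wrote 2 characters as $size bytes\n";
```

The character takes three bytes in UTF-8, so together with the newline
the file ends up four bytes long.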
Once you have your binary file properly opened in the right mode, you can
use all the same Perl I/O functions as you used on text files. However,
you may wish to use the fixed-size C<read> instead of the variable-sized
C<readline> for your input.

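Fixed-size reads are a natural fit for files made of fixed-length
records. The sketch below assumes an invented record layout of an
eight-byte, space-padded name followed by a 32-bit big-endian number, and
stores three such records in a scratch file before reading them back.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical record layout: 8-byte padded name, 32-bit big-endian number.
my $TEMPLATE    = "A8 N";
my $RECORD_SIZE = length pack($TEMPLATE, "", 0);

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "$0: can't close $filename: $!\n";

# Write three sample records.
open(my $out, "> :raw", $filename)
    || die "$0: can't open $filename in write-only mode: $!\n";
for my $pair (["alpha", 1], ["beta", 22], ["gamma", 333]) {
    print $out pack($TEMPLATE, @$pair);
}
close($out) || die "$0: can't close $filename: $!\n";

# Read them back, one whole record per call to read.
open(my $in, "< :raw", $filename)
    || die "$0: can't open $filename for reading: $!\n";

my %value_for;
while (my $got = read($in, my $record, $RECORD_SIZE)) {
    die "short record in $filename" unless $got == $RECORD_SIZE;
    my ($name, $number) = unpack($TEMPLATE, $record);
    $value_for{$name} = $number;
}
close($in) || die "$0: can't close $filename: $!\n";

print "$_ => $value_for{$_}\n" for sort keys %value_for;
```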
Here's an example of how to copy a binary file:

    my $BUFSIZ   = 64 * (2 ** 10);
    my $name_in  = "/some/input/file";
    my $name_out = "/some/output/file";

    my($in_fh, $out_fh, $buffer);

    open($in_fh,  "<", $name_in)  || die "$0: cannot open $name_in for reading: $!";
    open($out_fh, ">", $name_out) || die "$0: cannot open $name_out for writing: $!";

    # switch both handles to binary mode
    for my $fh ($in_fh, $out_fh)  {
        binmode($fh)              || die "binmode failed";
    }

    # copy one buffer at a time until read returns a false value
    while (read($in_fh, $buffer, $BUFSIZ)) {
        unless (print $out_fh $buffer) {
            die "couldn't write to $name_out: $!";
        }
    }

    close($in_fh)  || die "couldn't close $name_in: $!";
    close($out_fh) || die "couldn't close $name_out: $!";

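The C<read> function returns 0 at end of file but C<undef> on error, and
the loop above stops on either one. If you need to tell the two apart,
you might restructure the copy along the following lines. This is only a
sketch: C<copy_binary> is a made-up name, and the scratch files stand in
for real input and output paths.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $BUFSIZ = 64 * (2 ** 10);

# Hypothetical helper: copy $name_in to $name_out, return bytes copied.
sub copy_binary {
    my ($name_in, $name_out) = @_;

    open(my $in_fh,  "< :raw", $name_in)
        || die "$0: cannot open $name_in for reading: $!";
    open(my $out_fh, "> :raw", $name_out)
        || die "$0: cannot open $name_out for writing: $!";

    my $total = 0;
    while (1) {
        my $got = read($in_fh, my $buffer, $BUFSIZ);
        die "error reading $name_in: $!" unless defined $got;   # real error
        last if $got == 0;                                      # plain EOF
        print {$out_fh} $buffer
            or die "couldn't write to $name_out: $!";
        $total += $got;
    }

    close($in_fh)  || die "couldn't close $name_in: $!";
    close($out_fh) || die "couldn't close $name_out: $!";
    return $total;
}

# Build a 200,000-byte scratch input file, then copy it.
my ($src_fh, $name_in) = tempfile(UNLINK => 1);
binmode($src_fh) || die "binmode failed";
print {$src_fh} "\x00\xFF" x 100_000;
close($src_fh) || die "couldn't close $name_in: $!";

my ($dst_fh, $name_out) = tempfile(UNLINK => 1);
close($dst_fh) || die "couldn't close $name_out: $!";

my $copied = copy_binary($name_in, $name_out);
print "copied $copied bytes\n";
```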
=head1 Opening Pipes

To be announced.

=head1 Low-level File Opens via sysopen

To be announced. Or deleted.

=head1 SEE ALSO

To be announced.

=head1 AUTHOR and COPYRIGHT

To be announced.

=head1 HISTORY

To be announced.