=head1 NAME

perlopentut - simple recipes for opening files and pipes in Perl

=head1 DESCRIPTION

Whenever you do I/O on a file in Perl, you do so through what in Perl is
called a B<filehandle>. A filehandle is an internal name for an external
file. It is the job of the C<open> function to make the association
between the internal name and the external name, and it is the job
of the C<close> function to break that association.

For your convenience, Perl sets up a few special filehandles that are
already open when you run. These include C<STDIN>, C<STDOUT>, C<STDERR>,
and C<ARGV>. Since those are pre-opened, you can use them right away
without having to go to the trouble of opening them yourself:

    print STDERR "This is a debugging message.\n";

    print STDOUT "Please enter something: ";
    $response = <STDIN> // die "how come no input?";
    print STDOUT "Thank you!\n";

    while (<ARGV>) { ... }

As you see from those examples, C<STDOUT> and C<STDERR> are output
handles, and C<STDIN> and C<ARGV> are input handles. They are
in all capital letters because they are reserved to Perl, much
like the C<@ARGV> array and the C<%ENV> hash are. Their external
associations were set up by your shell.

You will need to open every other filehandle on your own. Although there
are many variants, the most common way to call Perl's C<open> function
is with three arguments and one return value:

C<    I<OK> = open(I<HANDLE>, I<MODE>, I<PATHNAME>)>

Where:

=over

=item I<OK>

will be some defined value if the open succeeds, but
C<undef> if it fails;

=item I<HANDLE>

should be an undefined scalar variable to be filled in by the
C<open> function if it succeeds;

=item I<MODE>

is the access mode and the encoding format to open the file with;

=item I<PATHNAME>

is the external name of the file you want opened.

=back

Most of the complexity of the C<open> function lies in the many
possible values that the I<MODE> parameter can take on.

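
As a minimal sketch of how the I<OK> return value can be used, the
following tries to open a file that is assumed not to exist and reports
the reason from C<$!> instead of dying. The pathname is hypothetical and
lives in a throwaway directory created with C<File::Temp>.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A hypothetical pathname inside a fresh, empty directory,
# so it cannot exist yet.
my $dir      = tempdir(CLEANUP => 1);
my $pathname = "$dir/no-such-file.txt";

my $handle;    # the scalar that open fills in on success
my $ok = open($handle, "< :encoding(UTF-8)", $pathname);

if ($ok) {
    print "opened $pathname\n";
    close($handle) || die "couldn't close $pathname: $!";
}
else {
    # $! holds the operating system's reason for the failure
    print "can't open $pathname: $!\n";
}
```

Testing the return value yourself, rather than always calling C<die>,
is useful when a missing file is a normal condition for your program.
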
One last thing before we show you how to open files: opening
files does not (usually) automatically lock them in Perl. See
L<perlfaq5> for how to lock.

=head1 Opening Text Files

=head2 Opening Text Files for Reading

If you want to read from a text file, first open it in
read-only mode like this:

    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";
    my $handle   = undef;     # this will be filled in on success

    open($handle, "< $encoding", $filename)
        || die "$0: can't open $filename for reading: $!";

As with the shell, in Perl the C<< "<" >> is used to open the file in
read-only mode. If it succeeds, Perl allocates a brand new filehandle for
you and fills in your previously undefined C<$handle> argument with a
reference to that handle.

Now you may use functions like C<readline>, C<read>, C<getc>, and
C<sysread> on that handle. Probably the most common input function
is the one that looks like an operator:

    $line = readline($handle);
    $line = <$handle>;          # same thing

Because the C<readline> function returns C<undef> at end of file or
upon error, you will sometimes see it used this way:

    $line = <$handle>;
    if (defined $line) {
        # do something with $line
    }
    else {
        # $line is not valid, so skip it
    }

You can also just quickly C<die> on an undefined value this way:

    $line = <$handle> // die "no input found";

However, if hitting EOF is an expected and normal event, you do not want to
exit simply because you have run out of input. Instead, you probably just want
to exit an input loop. You can then test to see if an actual error has caused
the loop to terminate, and act accordingly:

    while (<$handle>) {
        # do something with data in $_
    }
    if ($!) {
        die "unexpected error while reading from $filename: $!";
    }

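
Here is a self-contained sketch of such an input loop. It uses a
variation found in the C<readline> documentation, testing C<eof> before
each read so that an C<undef> from C<readline> can only mean an error.
The sample file and its contents are hypothetical, created in a
temporary location purely so that the example can run on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small UTF-8 sample file to read back (hypothetical data).
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
binmode($tmp_fh, ":encoding(UTF-8)") || die "cannot binmode temp file";
print {$tmp_fh} "first\n", "second\n", "third\n";
close($tmp_fh) || die "couldn't close $filename: $!";

open(my $handle, "< :encoding(UTF-8)", $filename)
    || die "$0: can't open $filename for reading: $!";

my @lines;
while (!eof($handle)) {
    # Not at EOF, so an undefined result here signals a read error.
    defined(my $line = readline($handle))
        || die "unexpected error while reading from $filename: $!";
    chomp $line;                # strip the trailing newline
    push @lines, $line;
}
close($handle) || die "couldn't close $filename: $!";

printf "read %d lines\n", scalar @lines;
```
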
B<A Note on Encodings>: Having to specify the text encoding every time
might seem a bit of a bother. To set up a default encoding for C<open> so
that you don't have to supply it each time, you can use the C<open> pragma:

    use open qw< :encoding(UTF-8) >;

Once you've done that, you can safely omit the encoding part of the
open mode:

    open($handle, "<", $filename)
        || die "$0: can't open $filename for reading: $!";

But never use the bare C<< "<" >> without having set up a default encoding
first. Otherwise, Perl cannot know which of the many, many, many possible
flavors of text file you have, and Perl will have no idea how to correctly
map the data in your file into actual characters it can work with. Other
common encoding formats include C<"ASCII">, C<"ISO-8859-1">,
C<"ISO-8859-15">, C<"Windows-1252">, C<"MacRoman">, and even C<"UTF-16LE">.
See L<perlunitut> for more about encodings.

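
To see why the encoding matters, consider this small sketch. It writes
the same four-character string to a temporary file twice, once as
ISO-8859-1 and once as UTF-8, and compares the resulting sizes in bytes.
The sample string and the temporary file are assumptions made for
illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "couldn't close $filename: $!";

my $text = "caf\x{E9}";     # "cafe" with an acute accent: 4 characters

# Write the same string under each encoding and record the file size.
my %size;
for my $name ("ISO-8859-1", "UTF-8") {
    open(my $out, "> :encoding($name)", $filename)
        || die "$0: can't open $filename for writing: $!";
    print {$out} $text;
    close($out) || die "couldn't close $filename: $!";
    $size{$name} = -s $filename;
}

# The file now holds the UTF-8 version; reading it back with the
# matching encoding restores the original characters.
open(my $in, "< :encoding(UTF-8)", $filename)
    || die "$0: can't open $filename for reading: $!";
my $round_trip = <$in>;
close($in) || die "couldn't close $filename: $!";

printf "ISO-8859-1: %d bytes, UTF-8: %d bytes\n",
    $size{"ISO-8859-1"}, $size{"UTF-8"};
print "round trip ok\n" if $round_trip eq $text;
```

The accented character occupies one byte in ISO-8859-1 but two in UTF-8,
so the same text produces files of 4 and 5 bytes respectively.
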
=head2 Opening Text Files for Writing

When you want to write to a file, you first have to decide what to do about
any existing contents of that file. You have two basic choices here: to
preserve or to clobber.

If you want to preserve any existing contents, then you want to open the file
in append mode. As in the shell, in Perl you use C<<< ">>" >>> to open an
existing file in append mode. C<<< ">>" >>> creates the file if it does not
already exist.

    my $handle   = undef;
    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";

    open($handle, ">> $encoding", $filename)
        || die "$0: can't open $filename for appending: $!";

Now you can write to that filehandle using any of C<print>, C<printf>,
C<say>, C<write>, or C<syswrite>.

As noted above, if the file does not already exist, then the append-mode open
will create it for you. But if the file does already exist, its contents are
safe from harm because you will be adding your new text past the end of the
old text.

On the other hand, sometimes you want to clobber whatever might already be
there. To empty out a file before you start writing to it, you can open it
in write-only mode:

    my $handle   = undef;
    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";

    open($handle, "> $encoding", $filename)
        || die "$0: can't open $filename in write-open mode: $!";

Here again Perl works just like the shell in that the C<< ">" >> clobbers
an existing file.

As with the append mode, when you open a file in write-only mode,
you can now write to that filehandle using any of C<print>, C<printf>,
C<say>, C<write>, or C<syswrite>.

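
The difference between the two modes can be sketched with a temporary
file. The helper subroutines C<write_line> and C<slurp_lines> are
hypothetical names introduced only for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "couldn't close $filename: $!";

# Hypothetical helper: open $filename in the given mode, write one line.
sub write_line {
    my ($mode, $text) = @_;
    open(my $handle, "$mode :encoding(UTF-8)", $filename)
        || die "$0: can't open $filename in mode $mode: $!";
    print {$handle} "$text\n";
    close($handle) || die "couldn't close $filename: $!";
}

# Hypothetical helper: return every line currently in $filename.
sub slurp_lines {
    open(my $handle, "< :encoding(UTF-8)", $filename)
        || die "$0: can't open $filename for reading: $!";
    chomp(my @lines = <$handle>);
    close($handle) || die "couldn't close $filename: $!";
    return @lines;
}

write_line(">>", "one");
write_line(">>", "two");
my @after_append = slurp_lines();     # both lines survive

write_line(">", "three");
my @after_clobber = slurp_lines();    # earlier contents are gone

print "after append:  @after_append\n";
print "after clobber: @after_clobber\n";
```

After the two appends the file holds both lines. After the write-only
open it holds only the line written last.
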
What about read-write mode? You should probably pretend it doesn't exist,
because opening text files in read-write mode is unlikely to do what you
would like. See L<perlfaq5> for details.

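
One common alternative to read-write mode, sketched here rather than
prescribed, is to read the old file, write a modified copy next to it,
and then rename the copy over the original. The file names are
hypothetical, and the sketch assumes both files live on the same
filesystem so that C<rename> can succeed.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $old = "$dir/data.txt";          # hypothetical file names
my $new = "$dir/data.txt.new";

# Seed the original file with sample contents.
open(my $seed, "> :encoding(UTF-8)", $old)
    || die "$0: can't open $old for writing: $!";
print {$seed} "alpha\n", "beta\n";
close($seed) || die "couldn't close $old: $!";

# Read from the old file and write the changed text to the new one.
open(my $in, "< :encoding(UTF-8)", $old)
    || die "$0: can't open $old for reading: $!";
open(my $out, "> :encoding(UTF-8)", $new)
    || die "$0: can't open $new for writing: $!";
while (my $line = <$in>) {
    print {$out} uc $line;          # the "edit": uppercase each line
}
close($in)  || die "couldn't close $old: $!";
close($out) || die "couldn't close $new: $!";

# Replace the original only after the copy is complete.
rename($new, $old) || die "can't rename $new to $old: $!";

open(my $check, "< :encoding(UTF-8)", $old)
    || die "$0: can't open $old for reading: $!";
my @result = <$check>;
close($check) || die "couldn't close $old: $!";
print @result;
```
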
=head1 Opening Binary Files

If the file to be opened contains binary data instead of text characters,
then the C<MODE> argument to C<open> is a little different. Instead of
specifying the encoding, you tell Perl that your data are in raw bytes.

    my $filename = "/some/path/to/a/binary/file/goes/here";
    my $encoding = ":raw :bytes";
    my $handle   = undef;     # this will be filled in on success

And then open as before, choosing C<<< "<" >>>, C<<< ">>" >>>, or
C<<< ">" >>> as needed:

    open($handle, "< $encoding", $filename)
        || die "$0: can't open $filename for reading: $!";

    open($handle, ">> $encoding", $filename)
        || die "$0: can't open $filename for appending: $!";

    open($handle, "> $encoding", $filename)
        || die "$0: can't open $filename in write-open mode: $!";

Alternately, you can change to binary mode on an existing handle this way:

    binmode($handle)    || die "cannot binmode handle";

This is especially handy for the handles that Perl has already opened for you.

    binmode(STDIN)      || die "cannot binmode STDIN";
    binmode(STDOUT)     || die "cannot binmode STDOUT";

You can also pass C<binmode> an explicit encoding to change it on the fly.
This isn't exactly "binary" mode, but we still use C<binmode> to do it:

    binmode(STDIN,  ":encoding(MacRoman)") || die "cannot binmode STDIN";
    binmode(STDOUT, ":encoding(UTF-8)")    || die "cannot binmode STDOUT";

Once you have your binary file properly opened in the right mode, you can
use all the same Perl I/O functions as you used on text files. However,
you may wish to use the fixed-size C<read> instead of the variable-sized
C<readline> for your input.

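
As an illustration of fixed-size input, this sketch reads a file made of
8-byte records with C<read> and decodes each one with C<unpack>. The
record layout (a 32-bit big-endian number followed by a 4-byte tag) is
an assumption invented for the example, as is the sample data.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $RECSIZE = 8;    # hypothetical layout: 4-byte ID plus 4-byte tag

# Build a sample binary file holding three records.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
binmode($tmp_fh) || die "cannot binmode temp file";
print {$tmp_fh} pack("N a4", 1, "abcd"),
                pack("N a4", 2, "efgh"),
                pack("N a4", 3, "ijkl");
close($tmp_fh) || die "couldn't close $filename: $!";

open(my $handle, "< :raw", $filename)
    || die "$0: can't open $filename for reading: $!";

my @records;
while (1) {
    my $got = read($handle, my $buffer, $RECSIZE);
    die "read failed on $filename: $!" unless defined $got;
    last if $got == 0;                          # end of file
    die "short record of $got bytes" if $got < $RECSIZE;
    my ($id, $tag) = unpack("N a4", $buffer);
    push @records, "$id:$tag";
}
close($handle) || die "couldn't close $filename: $!";

print "@records\n";
```

Checking the return value of C<read> separates the three cases that
matter: an error (C<undef>), end of file (zero), and a truncated record.
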
Here's an example of how to copy a binary file:

    my $BUFSIZ   = 64 * (2 ** 10);
    my $name_in  = "/some/input/file";
    my $name_out = "/some/output/file";

    my($in_fh, $out_fh, $buffer);

    open($in_fh,  "<", $name_in)
        || die "$0: cannot open $name_in for reading: $!";
    open($out_fh, ">", $name_out)
        || die "$0: cannot open $name_out for writing: $!";

    # switch both handles to binary mode
    for my $fh ($in_fh, $out_fh)  {
        binmode($fh)               || die "binmode failed";
    }

    # copy one buffer at a time until read returns 0 at end of file
    while (read($in_fh, $buffer, $BUFSIZ)) {
        unless (print $out_fh $buffer) {
            die "couldn't write to $name_out: $!";
        }
    }

    close($in_fh)       || die "couldn't close $name_in: $!";
    close($out_fh)      || die "couldn't close $name_out: $!";

=head1 Opening Pipes

Perl also lets you open a filehandle into an external program or shell
command rather than into a file. You can do this in order to pass data
from your Perl program to an external command for further processing, or
to receive data from another program for your own Perl program to
process.

Filehandles into commands are also known as I<pipes>, since they work on
similar inter-process communication principles as Unix pipelines. Such a
filehandle has an active program instead of a static file on its
external end, but in every other sense it works just like a more typical
file-based filehandle, with all the techniques discussed earlier in this
article just as applicable.

As such, you open a pipe using the same C<open> call that you use for
opening files, setting the second (C<MODE>) argument to special
characters that indicate either an input or an output pipe. Use C<"-|"> for a
filehandle that will let your Perl program read data from an external
program, and C<"|-"> for a filehandle that will send data to that
program instead.

=head2 Opening a pipe for reading

Let's say you'd like your Perl program to process data stored in a nearby
directory called C<unsorted>, which contains a number of text files.
You'd also like your program to sort all the contents from these files
into a single, alphabetically sorted list of unique lines before it
starts processing them.

You could do this through opening an ordinary filehandle into each of
those files, gradually building up an in-memory array of all the file
contents you load this way, and finally sorting and filtering that array
when you've run out of files to load. I<Or>, you could offload all that
merging and sorting into your operating system's own C<sort> command by
opening a pipe directly into its output, and get to work that much
faster.

Here's how that might look:

    open(my $sort_fh, '-|', 'sort -u unsorted/*.txt')
        or die "Couldn't open a pipe into sort: $!";

    # And right away, we can start reading sorted lines:
    while (my $line = <$sort_fh>) {
        #
        # ... Do something interesting with each $line here ...
        #
    }

The second argument to C<open>, C<"-|">, makes it a read-pipe into a
separate program, rather than an ordinary filehandle into a file.

Note that the third argument to C<open> is a string containing the
program name (C<sort>) plus all its arguments: in this case, C<-u> to
specify unique sort, and then a fileglob specifying the files to sort.
The resulting filehandle C<$sort_fh> works just like a read-only (C<<
"<" >>) filehandle, and your program can subsequently read data
from it as if it were opened onto an ordinary, single file.

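
The following sketch runs the same kind of read-pipe end to end and also
checks how the command finished. It assumes a Unix-like system with
C<sort> on the C<PATH>, and a temporary directory whose path contains no
spaces or shell metacharacters. The sample files are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: Unix-like shell, "sort" available, and a temp path
# that is safe to interpolate into a shell command.
my $dir    = tempdir(CLEANUP => 1);
my %sample = ("a.txt" => "pear\napple\n", "b.txt" => "apple\nfig\n");
for my $name (sort keys %sample) {
    open(my $out, "> :encoding(UTF-8)", "$dir/$name")
        || die "$0: can't open $dir/$name for writing: $!";
    print {$out} $sample{$name};
    close($out) || die "couldn't close $dir/$name: $!";
}

open(my $sort_fh, '-|', "sort -u $dir/*.txt")
    or die "Couldn't open a pipe into sort: $!";
chomp(my @sorted = <$sort_fh>);

# Closing a pipe waits for the command to finish.  A false return
# with $! equal to 0 means the command exited with a non-zero
# status, which is then available in $?.
close($sort_fh)
    or die($! ? "Error closing sort pipe: $!"
              : "sort exited with status " . ($? >> 8));

print "@sorted\n";
```

Checking the result of C<close> matters more for pipes than for plain
files, because it is the only point at which your program learns whether
the external command succeeded.
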
=head2 Opening a pipe for writing

Continuing the previous example, let's say that your program has
completed its processing, and the results sit in an array called
C<@processed>. You want to print these lines to a file called
C<numbered.txt> with a neatly formatted column of line-numbers.

Certainly you could write your own code to do this. Or, once again,
you could kick that work over to another program. In this case, C<cat>,
running with its own C<-n> option to activate line numbering, should do
the trick:

    open(my $cat_fh, '|-', 'cat -n > numbered.txt')
        or die "Couldn't open a pipe into cat: $!";

    for my $line (@processed) {
        print $cat_fh $line;
    }

Here, we use a second C<open> argument of C<"|-">, signifying that the
filehandle assigned to C<$cat_fh> should be a write-pipe. We can then
use it just as we would a write-only ordinary filehandle, including the
basic function of C<print>-ing data to it.

Note that the third argument, specifying the command that we wish to
pipe to, sets up C<cat> to redirect its output via that C<< ">" >>
symbol into the file C<numbered.txt>. This can start to look a little
tricky, because that same symbol would have meant something
entirely different had it shown up in the second argument to C<open>!
But here in the third argument, it's simply part of the shell command that
Perl will open the pipe into, and Perl itself doesn't invest any special
meaning in it.

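
A runnable variant of that write-pipe might look like the sketch below.
It assumes a Unix-like shell with C<cat> on the C<PATH> and writes into
a temporary directory whose path is assumed to be free of spaces and
shell metacharacters. The contents of C<@processed> are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: Unix-like shell, "cat" available, shell-safe temp path.
my $dir       = tempdir(CLEANUP => 1);
my $numbered  = "$dir/numbered.txt";
my @processed = ("apple\n", "fig\n", "pear\n");   # hypothetical results

open(my $cat_fh, '|-', "cat -n > $numbered")
    or die "Couldn't open a pipe into cat: $!";
print {$cat_fh} @processed
    or die "Couldn't write to cat: $!";

# The output file is only certain to be complete after close,
# which flushes buffered data and waits for cat to exit.
close($cat_fh)
    or die($! ? "Error closing cat pipe: $!"
              : "cat exited with status " . ($? >> 8));

open(my $in, "< :encoding(UTF-8)", $numbered)
    or die "$0: can't open $numbered for reading: $!";
my @lines = <$in>;
close($in) or die "couldn't close $numbered: $!";

print @lines;
```

Reading the file back only after C<close> avoids seeing partial output
while C<cat> is still running.
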
=head2 Expressing the command as a list

For opening pipes, Perl offers the option to call C<open> with a list
comprising the desired command and all its own arguments as separate
elements, rather than combining them into a single string as in the
examples above. For instance, we could have phrased the C<open> call in
the first example like this:

    open(my $sort_fh, '-|', 'sort', '-u', glob('unsorted/*.txt'))
        or die "Couldn't open a pipe into sort: $!";

When you call C<open> this way, Perl invokes the given command directly,
bypassing the shell. As such, the shell won't try to interpret any
special characters within the command's argument list, which might
otherwise have unwanted effects. This can make for safer, less
error-prone C<open> calls, useful in cases such as passing in variables
as arguments, or even just referring to filenames with spaces in them.

However, when you I<do> want to pass a meaningful metacharacter to the
shell, such as the C<"*"> inside that final C<unsorted/*.txt> argument
here, you can't use this alternate syntax. In this case, we have worked
around it via Perl's handy C<glob> built-in function, which evaluates
its argument into a list of filenames, and we can safely pass that
resulting list right into C<open>, as shown above.

Note also that representing piped-command arguments in list form like
this doesn't work on every platform. It will work on any Unix-based OS
that provides a real C<fork> function (e.g. macOS or Linux), as well as
on Windows when running Perl 5.22 or later.

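
To illustrate how the list form protects arguments, this sketch passes a
string full of shell-sensitive characters to a child process and reads
it back unchanged. The child is the running Perl interpreter itself,
found through C<$^X>, standing in for any external command. The argument
string is a hypothetical example.

```perl
use strict;
use warnings;

# A hypothetical argument that a shell would split or expand.
my $tricky = 'two words; $HOME *';

# $^X is the path of the running perl.  The one-line child program
# prints its first argument exactly as it received it.
open(my $echo_fh, '-|', $^X, '-e', 'print $ARGV[0]', $tricky)
    or die "Couldn't open a pipe into perl: $!";
my $echoed = do { local $/; <$echo_fh> };    # read all output at once
close($echo_fh)
    or die($! ? "Error closing pipe: $!"
              : "child exited with status " . ($? >> 8));

print $echoed eq $tricky
    ? "argument arrived intact\n"
    : "argument was altered\n";
```

Had the same text been interpolated into a single command string, the
shell would have treated the semicolon, the dollar sign, and the
asterisk as its own syntax.
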
=head1 SEE ALSO

The full documentation for L<C<open>|perlfunc/open FILEHANDLE,MODE,EXPR>
provides a thorough reference to this function, beyond the best-practice
basics covered here.

=head1 AUTHOR and COPYRIGHT

Copyright 2013 Tom Christiansen; now maintained by Perl5 Porters

This documentation is free; you can redistribute it and/or modify it under
the same terms as Perl itself.