1=head1 NAME
2
3perlfaq5 - Files and Formats ($Revision: 10126 $)
4
5=head1 DESCRIPTION
6
7This section deals with I/O and the "f" issues: filehandles, flushing,
8formats, and footers.
9
10=head2 How do I flush/unbuffer an output filehandle? Why must I do this?
11X<flush> X<buffer> X<unbuffer> X<autoflush>
12
Perl does not support truly unbuffered output (except insofar as you
can C<syswrite(OUT, $char, 1)>), although it does support "command
buffering", in which a physical write is performed after every output
command.
17
The C standard I/O library (stdio) normally buffers characters sent to
devices so that there isn't a system call for each byte. In most stdio
implementations, the type of output buffering and the size of the
buffer vary according to the type of device. Perl's C<print()> and
C<write()> functions normally buffer output, while C<syswrite()>
bypasses buffering altogether.
24
25If you want your output to be sent immediately when you execute
26C<print()> or C<write()> (for instance, for some network protocols),
27you must set the handle's autoflush flag. This flag is the Perl
28variable C<$|> and when it is set to a true value, Perl will flush the
29handle's buffer after each C<print()> or C<write()>. Setting C<$|>
30affects buffering only for the currently selected default filehandle.
31You choose this handle with the one argument C<select()> call (see
32L<perlvar/$E<verbar>> and L<perlfunc/select>).
33
34Use C<select()> to choose the desired handle, then set its
35per-filehandle variables.
36
37 $old_fh = select(OUTPUT_HANDLE);
38 $| = 1;
39 select($old_fh);
40
41Some modules offer object-oriented access to handles and their
42variables, although they may be overkill if this is the only thing you
43do with them. You can use C<IO::Handle>:
44
 use IO::Handle;
 open( my $printer, ">", "/dev/printer" )
     or die "Can't open /dev/printer: $!";
 $printer->autoflush(1);
48
49or C<IO::Socket> (which inherits from C<IO::Handle>):
50
51 use IO::Socket; # this one is kinda a pipe?
52 my $sock = IO::Socket::INET->new( 'www.example.com:80' );
53
54 $sock->autoflush();
55
56You can also flush an C<IO::Handle> object without setting
57C<autoflush>. Call the C<flush> method to flush the buffer yourself:
58
 use IO::Handle;
 open( my $printer, ">", "/dev/printer" )
     or die "Can't open /dev/printer: $!";
 $printer->flush; # one time flush
62
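To see the effect of the autoflush flag, the following sketch prints one
line to a scratch file and asks the operating system how many bytes have
arrived while the handle is still open. The helper name
C<bytes_visible_after_print> is made up for this illustration, and the
buffering behavior it shows is what you would expect from a typical Unix
build of Perl rather than a guarantee:

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Hypothetical helper: print one line, then report the file size the OS
# sees while the handle is still open.
sub bytes_visible_after_print {
    my ($autoflush) = @_;
    my ($fh, $name) = tempfile( UNLINK => 1 );
    binmode $fh;                       # no newline translation
    $fh->autoflush(1) if $autoflush;
    print {$fh} "hello\n";
    my $size = (stat $name)[7];        # ask the OS, not the handle
    close $fh;
    return $size;
}

printf "autoflush on:  %d bytes\n", bytes_visible_after_print(1);
printf "autoflush off: %d bytes\n", bytes_visible_after_print(0);
```

With autoflush on, all six bytes should be visible right away. Without
it, the line normally sits in Perl's buffer until the handle is flushed
or closed.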
63
64=head2 How do I change, delete, or insert a line in a file, or append to the beginning of a file?
65X<file, editing>
66
67(contributed by brian d foy)
68
69The basic idea of inserting, changing, or deleting a line from a text
70file involves reading and printing the file to the point you want to
71make the change, making the change, then reading and printing the rest
72of the file. Perl doesn't provide random access to lines (especially
73since the record input separator, C<$/>, is mutable), although modules
74such as C<Tie::File> can fake it.
75
76A Perl program to do these tasks takes the basic form of opening a
77file, printing its lines, then closing the file:
78
79 open my $in, '<', $file or die "Can't read old file: $!";
80 open my $out, '>', "$file.new" or die "Can't write new file: $!";
81
82 while( <$in> )
83 {
84 print $out $_;
85 }
86
87 close $out;
88
89Within that basic form, add the parts that you need to insert, change,
90or delete lines.
91
92To prepend lines to the beginning, print those lines before you enter
93the loop that prints the existing lines.
94
95 open my $in, '<', $file or die "Can't read old file: $!";
96 open my $out, '>', "$file.new" or die "Can't write new file: $!";
97
 print $out "# Add this line to the top\n"; # <--- HERE'S THE MAGIC
99
100 while( <$in> )
101 {
102 print $out $_;
103 }
104
105 close $out;
106
To change existing lines, insert the code to modify the lines inside
the C<while> loop. In this case, the code finds all lowercased
versions of "perl" and capitalizes them. This happens for every line, so
be sure that you're supposed to do that on every line!
111
112 open my $in, '<', $file or die "Can't read old file: $!";
113 open my $out, '>', "$file.new" or die "Can't write new file: $!";
114
 print $out "# Add this line to the top\n";
116
117 while( <$in> )
118 {
119 s/\b(perl)\b/Perl/g;
120 print $out $_;
121 }
122
123 close $out;
124
125To change only a particular line, the input line number, C<$.>, is
126useful. First read and print the lines up to the one you want to
127change. Next, read the single line you want to change, change it, and
128print it. After that, read the rest of the lines and print those:
129
130 while( <$in> ) # print the lines before the change
131 {
132 print $out $_;
133 last if $. == 4; # line number before change
134 }
135
136 my $line = <$in>;
137 $line =~ s/\b(perl)\b/Perl/g;
138 print $out $line;
139
140 while( <$in> ) # print the rest of the lines
141 {
142 print $out $_;
143 }
144
145To skip lines, use the looping controls. The C<next> in this example
146skips comment lines, and the C<last> stops all processing once it
147encounters either C<__END__> or C<__DATA__>.
148
149 while( <$in> )
150 {
 next if /^\s*#/; # skip comment lines
152 last if /^__(END|DATA)__$/; # stop at end of code marker
153 print $out $_;
154 }
155
156Do the same sort of thing to delete a particular line by using C<next>
157to skip the lines you don't want to show up in the output. This
158example skips every fifth line:
159
160 while( <$in> )
161 {
162 next unless $. % 5;
163 print $out $_;
164 }
165
166If, for some odd reason, you really want to see the whole file at once
167rather than processing line by line, you can slurp it in (as long as
168you can fit the whole thing in memory!):
169
 open my $in, '<', $file or die "Can't read old file: $!";
171 open my $out, '>', "$file.new" or die "Can't write new file: $!";
172
173 my @lines = do { local $/; <$in> }; # slurp!
174
175 # do your magic here
176
177 print $out @lines;
178
179Modules such as C<File::Slurp> and C<Tie::File> can help with that
180too. If you can, however, avoid reading the entire file at once. Perl
181won't give that memory back to the operating system until the process
182finishes.
183
184You can also use Perl one-liners to modify a file in-place. The
185following changes all 'Fred' to 'Barney' in F<inFile.txt>, overwriting
186the file with the new contents. With the C<-p> switch, Perl wraps a
187C<while> loop around the code you specify with C<-e>, and C<-i> turns
188on in-place editing. The current line is in C<$_>. With C<-p>, Perl
189automatically prints the value of C<$_> at the end of the loop. See
190L<perlrun> for more details.
191
192 perl -pi -e 's/Fred/Barney/' inFile.txt
193
194To make a backup of C<inFile.txt>, give C<-i> a file extension to add:
195
196 perl -pi.bak -e 's/Fred/Barney/' inFile.txt
197
198To change only the fifth line, you can add a test checking C<$.>, the
199input line number, then only perform the operation when the test
200passes:
201
202 perl -pi -e 's/Fred/Barney/ if $. == 5' inFile.txt
203
204To add lines before a certain line, you can add a line (or lines!)
205before Perl prints C<$_>:
206
207 perl -pi -e 'print "Put before third line\n" if $. == 3' inFile.txt
208
209You can even add a line to the beginning of a file, since the current
210line prints at the end of the loop:
211
212 perl -pi -e 'print "Put before first line\n" if $. == 1' inFile.txt
213
214To insert a line after one already in the file, use the C<-n> switch.
215It's just like C<-p> except that it doesn't print C<$_> at the end of
216the loop, so you have to do that yourself. In this case, print C<$_>
217first, then print the line that you want to add.
218
219 perl -ni -e 'print; print "Put after fifth line\n" if $. == 5' inFile.txt
220
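To delete lines, only print the ones that you want.

 perl -ni -e 'print unless /d/' inFile.txt

With C<-p>, a bare C<next> does not suppress the automatic print,
because Perl prints from a C<continue> block that C<next> still runs.
Empty C<$_> instead:

 perl -pi -e '$_ = "" if /d/' inFile.txt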
228
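The program fragments above stop after writing F<$file.new>. A complete
edit usually finishes by renaming the new file over the original. Here
is one way to package the whole read-change-write cycle. The subroutine
name C<rewrite_file> and its callback interface are inventions for this
sketch, not part of any module:

```perl
use strict;
use warnings;

# Hypothetical helper: for each line, call $change with the line in $_
# and the line number as its argument, write whatever is left in $_,
# then rename the new file over the original.
sub rewrite_file {
    my ($file, $change) = @_;

    open( my $in,  '<', $file )       or die "Can't read old file: $!";
    open( my $out, '>', "$file.new" ) or die "Can't write new file: $!";

    local $_;                          # don't clobber the caller's $_
    while ( <$in> ) {
        $change->($.);                 # callback may modify or empty $_
        print {$out} $_ or die "Can't write to $file.new: $!";
    }

    close $in;
    close $out                   or die "Can't close $file.new: $!";
    rename( "$file.new", $file ) or die "Can't rename $file.new: $!";
    return 1;
}
```

A callback such as C<sub { s/\bperl\b/Perl/g; $_ = '' if $_[0] == 2 }>
would capitalize every "perl" and delete the second line.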
229=head2 How do I count the number of lines in a file?
230X<file, counting lines> X<lines> X<line>
231
232One fairly efficient way is to count newlines in the file. The
233following program uses a feature of tr///, as documented in L<perlop>.
234If your text file doesn't end with a newline, then it's not really a
235proper text file, so this may report one fewer line than you expect.
236
237 $lines = 0;
238 open(FILE, $filename) or die "Can't open `$filename': $!";
239 while (sysread FILE, $buffer, 4096) {
240 $lines += ($buffer =~ tr/\n//);
241 }
242 close FILE;
243
244This assumes no funny games with newline translations.
245
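The same loop can be wrapped in a subroutine that also counts a final
line lacking a newline. This is only a sketch; the name C<count_lines>
and the 64 KB read size are arbitrary choices:

```perl
use strict;
use warnings;

sub count_lines {
    my ($filename) = @_;
    open( my $fh, '<:raw', $filename )
        or die "Can't open '$filename': $!";

    my $lines     = 0;
    my $last_char = "\n";              # an empty file has no partial line
    my $buffer;
    while ( sysread $fh, $buffer, 65536 ) {
        $lines += ( $buffer =~ tr/\n// );    # tr/// returns the match count
        $last_char = substr $buffer, -1;
    }
    close $fh;

    $lines++ if $last_char ne "\n";    # count an unterminated last line
    return $lines;
}
```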
246=head2 How can I use Perl's C<-i> option from within a program?
247X<-i> X<in-place>
248
249C<-i> sets the value of Perl's C<$^I> variable, which in turn affects
250the behavior of C<< <> >>; see L<perlrun> for more details. By
251modifying the appropriate variables directly, you can get the same
252behavior within a larger program. For example:
253
254 # ...
255 {
256 local($^I, @ARGV) = ('.orig', glob("*.c"));
257 while (<>) {
258 if ($. == 1) {
259 print "This line should appear at the top of each file\n";
260 }
261 s/\b(p)earl\b/${1}erl/i; # Correct typos, preserving case
262 print;
263 close ARGV if eof; # Reset $.
264 }
265 }
266 # $^I and @ARGV return to their old values here
267
268This block modifies all the C<.c> files in the current directory,
269leaving a backup of the original data from each file in a new
270C<.c.orig> file.
271
272=head2 How can I copy a file?
273X<copy> X<file, copy>
274
275(contributed by brian d foy)
276
277Use the File::Copy module. It comes with Perl and can do a
278true copy across file systems, and it does its magic in
279a portable fashion.
280
281 use File::Copy;
282
283 copy( $original, $new_copy ) or die "Copy failed: $!";
284
285If you can't use File::Copy, you'll have to do the work yourself:
286open the original file, open the destination file, then print
287to the destination file as you read the original.
288
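A hand-rolled copy along those lines might look like the following
sketch. The name C<copy_by_hand> and the 64 KB chunk size are choices
made for this example. Unlike File::Copy, it makes no attempt to
preserve permissions or handle platform quirks:

```perl
use strict;
use warnings;

sub copy_by_hand {
    my ($original, $new_copy) = @_;

    # :raw keeps binary data intact on platforms that translate newlines
    open( my $in,  '<:raw', $original ) or die "Can't read $original: $!";
    open( my $out, '>:raw', $new_copy ) or die "Can't write $new_copy: $!";

    my $chunk;
    while ( read( $in, $chunk, 64 * 1024 ) ) {
        print {$out} $chunk or die "Can't write to $new_copy: $!";
    }

    close $in;
    close $out or die "Can't close $new_copy: $!";
    return 1;
}
```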
289=head2 How do I make a temporary file name?
290X<file, temporary>
291
292If you don't need to know the name of the file, you can use C<open()>
293with C<undef> in place of the file name. The C<open()> function
294creates an anonymous temporary file.
295
296 open my $tmp, '+>', undef or die $!;
297
298Otherwise, you can use the File::Temp module.
299
300 use File::Temp qw/ tempfile tempdir /;
301
302 $dir = tempdir( CLEANUP => 1 );
303 ($fh, $filename) = tempfile( DIR => $dir );
304
305 # or if you don't need to know the filename
306
307 $fh = tempfile( DIR => $dir );
308
File::Temp has been a standard module since Perl 5.6.1. If you
don't have a modern enough Perl installed, use the C<new_tmpfile>
class method from the IO::File module to get a filehandle opened for
reading and writing. Use it if you don't need to know the file's name:
313
314 use IO::File;
315 $fh = IO::File->new_tmpfile()
316 or die "Unable to make new temporary file: $!";
317
318If you're committed to creating a temporary file by hand, use the
319process ID and/or the current time-value. If you need to have many
320temporary files in one process, use a counter:
321
322 BEGIN {
323 use Fcntl;
324 my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMPDIR} || $ENV{TEMP};
325 my $base_name = sprintf "%s/%d-%d-0000", $temp_dir, $$, time;
326
327 sub temp_file {
328 local *FH;
329 my $count = 0;
330 until( defined(fileno(FH)) || $count++ > 100 ) {
331 $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
332 # O_EXCL is required for security reasons.
333 sysopen FH, $base_name, O_WRONLY|O_EXCL|O_CREAT;
334 }
335
336 if( defined fileno(FH) ) {
337 return (*FH, $base_name);
338 }
339 else {
340 return ();
341 }
342 }
343
344 }
345
346=head2 How can I manipulate fixed-record-length files?
347X<fixed-length> X<file, fixed-length records>
348
349The most efficient way is using L<pack()|perlfunc/"pack"> and
350L<unpack()|perlfunc/"unpack">. This is faster than using
351L<substr()|perlfunc/"substr"> when taking many, many strings. It is
352slower for just a few.
353
354Here is a sample chunk of code to break up and put back together again
355some fixed-format input lines, in this case from the output of a normal,
356Berkeley-style ps:
357
358 # sample input line:
359 # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
360 my $PS_T = 'A6 A4 A7 A5 A*';
361 open my $ps, '-|', 'ps';
362 print scalar <$ps>;
363 my @fields = qw( pid tt stat time command );
364 while (<$ps>) {
365 my %process;
366 @process{@fields} = unpack($PS_T, $_);
367 for my $field ( @fields ) {
368 print "$field: <$process{$field}>\n";
369 }
370 print 'line=', pack($PS_T, @process{@fields} ), "\n";
371 }
372
373We've used a hash slice in order to easily handle the fields of each row.
374Storing the keys in an array means it's easy to operate on them as a
375group or loop over them with for. It also avoids polluting the program
376with global variables and using symbolic references.
377
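For an illustration that does not depend on the output of ps, here is a
sketch with an invented record layout (a 10-byte name, a 5-byte
quantity, and a 3-byte code). The template, field names, and subroutine
names are all assumptions made for the example:

```perl
use strict;
use warnings;

my $TEMPLATE = 'A10 A5 A3';            # invented layout, 18 bytes per record
my @FIELDS   = qw( name qty code );

# Turn a hash of fields into one space-padded fixed-length record.
sub format_record {
    my (%record) = @_;
    return pack $TEMPLATE, @record{@FIELDS};
}

# Split a fixed-length record back into named fields.
sub parse_record {
    my ($line) = @_;
    my %record;
    @record{@FIELDS} = unpack $TEMPLATE, $line;
    return %record;
}

my $line   = format_record( name => 'widget', qty => 42, code => 'NY' );
my %fields = parse_record($line);
print "<$line>\n";
print "$_: <$fields{$_}>\n" for @FIELDS;
```

The C<A> format pads with spaces when packing and strips trailing
spaces when unpacking, so use C<a> instead if trailing whitespace in a
field is significant.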
378=head2 How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?
379X<filehandle, local> X<filehandle, passing> X<filehandle, reference>
380
381As of perl5.6, open() autovivifies file and directory handles
382as references if you pass it an uninitialized scalar variable.
383You can then pass these references just like any other scalar,
384and use them in the place of named handles.
385
386 open my $fh, $file_name;
387
388 open local $fh, $file_name;
389
390 print $fh "Hello World!\n";
391
392 process_file( $fh );
393
394If you like, you can store these filehandles in an array or a hash.
395If you access them directly, they aren't simple scalars and you
396need to give C<print> a little help by placing the filehandle
397reference in braces. Perl can only figure it out on its own when
398the filehandle reference is a simple scalar.
399
400 my @fhs = ( $fh1, $fh2, $fh3 );
401
402 for( $i = 0; $i <= $#fhs; $i++ ) {
403 print {$fhs[$i]} "just another Perl answer, \n";
404 }
405
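The same block syntax works for a hash of filehandles. The sketch below
keeps one handle per channel and prints each message to the right one.
It writes to in-memory strings (which needs Perl 5.8 or later) so that
it runs without touching the disk. The subroutine C<route_messages> and
the channel names are made up for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: each message is [ $channel, $text ]; the return
# value maps each channel name to everything printed on it.
sub route_messages {
    my (@messages) = @_;
    my %buffer = ( out => '', err => '' );
    my %fh;

    for my $channel ( keys %buffer ) {
        open( $fh{$channel}, '>', \$buffer{$channel} )
            or die "Can't open in-memory handle for $channel: $!";
    }

    for my $message (@messages) {
        my ($channel, $text) = @$message;
        print { $fh{$channel} } "$text\n";   # block form is required here
    }

    close $_ for values %fh;
    return \%buffer;
}

my $captured = route_messages(
    [ out => 'started' ], [ err => 'oops' ], [ out => 'done' ],
);
print "out: $captured->{out}";
print "err: $captured->{err}";
```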
406Before perl5.6, you had to deal with various typeglob idioms
407which you may see in older code.
408
409 open FILE, "> $filename";
410 process_typeglob( *FILE );
411 process_reference( \*FILE );
412
413 sub process_typeglob { local *FH = shift; print FH "Typeglob!" }
414 sub process_reference { local $fh = shift; print $fh "Reference!" }
415
416If you want to create many anonymous handles, you should
417check out the Symbol or IO::Handle modules.
418
419=head2 How can I use a filehandle indirectly?
420X<filehandle, indirect>
421
An indirect filehandle is the use of something other than a symbol
in a place where a filehandle is expected. Here are ways
to get indirect filehandles:
425
426 $fh = SOME_FH; # bareword is strict-subs hostile
427 $fh = "SOME_FH"; # strict-refs hostile; same package only
428 $fh = *SOME_FH; # typeglob
429 $fh = \*SOME_FH; # ref to typeglob (bless-able)
430 $fh = *SOME_FH{IO}; # blessed IO::Handle from *SOME_FH typeglob
431
432Or, you can use the C<new> method from one of the IO::* modules to
433create an anonymous filehandle, store that in a scalar variable,
434and use it as though it were a normal filehandle.
435
436 use IO::Handle; # 5.004 or higher
437 $fh = IO::Handle->new();
438
439Then use any of those as you would a normal filehandle. Anywhere that
440Perl is expecting a filehandle, an indirect filehandle may be used
441instead. An indirect filehandle is just a scalar variable that contains
442a filehandle. Functions like C<print>, C<open>, C<seek>, or
443the C<< <FH> >> diamond operator will accept either a named filehandle
444or a scalar variable containing one:
445
446 ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
447 print $ofh "Type it: ";
 $got = <$ifh>;
449 print $efh "What was that: $got";
450
451If you're passing a filehandle to a function, you can write
452the function in two ways:
453
454 sub accept_fh {
455 my $fh = shift;
456 print $fh "Sending to indirect filehandle\n";
457 }
458
459Or it can localize a typeglob and use the filehandle directly:
460
461 sub accept_fh {
462 local *FH = shift;
463 print FH "Sending to localized filehandle\n";
464 }
465
466Both styles work with either objects or typeglobs of real filehandles.
467(They might also work with strings under some circumstances, but this
468is risky.)
469
470 accept_fh(*STDOUT);
471 accept_fh($handle);
472
473In the examples above, we assigned the filehandle to a scalar variable
474before using it. That is because only simple scalar variables, not
475expressions or subscripts of hashes or arrays, can be used with
476built-ins like C<print>, C<printf>, or the diamond operator. Using
477something other than a simple scalar variable as a filehandle is
478illegal and won't even compile:
479
480 @fd = (*STDIN, *STDOUT, *STDERR);
481 print $fd[1] "Type it: "; # WRONG
 $got = <$fd[0]>; # WRONG
483 print $fd[2] "What was that: $got"; # WRONG
484
485With C<print> and C<printf>, you get around this by using a block and
486an expression where you would place the filehandle:
487
488 print { $fd[1] } "funny stuff\n";
489 printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
490 # Pity the poor deadbeef.
491
492That block is a proper block like any other, so you can put more
493complicated code there. This sends the message out to one of two places:
494
495 $ok = -x "/bin/cat";
496 print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
497 print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n";
498
This approach of treating C<print> and C<printf> like object method
calls doesn't work for the diamond operator. That's because it's a
real operator, not just a function with a comma-less argument. Assuming
you've been storing typeglobs in your structure as we did above, you
can use the built-in function named C<readline> to read a record just
as C<< <> >> does. Given the initialization shown above for @fd, this
would work, but only because readline() requires a typeglob. It doesn't
work with objects or strings, which might be a bug we haven't fixed yet.
507
508 $got = readline($fd[0]);
509
510Let it be noted that the flakiness of indirect filehandles is not
511related to whether they're strings, typeglobs, objects, or anything else.
512It's the syntax of the fundamental operators. Playing the object
513game doesn't help you at all here.
514
515=head2 How can I set up a footer format to be used with write()?
516X<footer>
517
518There's no builtin way to do this, but L<perlform> has a couple of
519techniques to make it possible for the intrepid hacker.
520
521=head2 How can I write() into a string?
522X<write, into a string>
523
524See L<perlform/"Accessing Formatting Internals"> for an C<swrite()> function.
525
526=head2 How can I open a filehandle to a string?
X<string> X<open> X<IO::Scalar> X<filehandle>
528
529(contributed by Peter J. Holzer, hjp-usenet2@hjp.at)
530
Since Perl 5.8.0, you can pass a reference to a scalar instead of the
filename to create a file handle which you can use to read from or write to
a string:
534
 open(my $fh, '>', \$string) or die "Could not open string for writing: $!";
 print $fh "foo\n";
 print $fh "bar\n"; # $string now contains "foo\nbar\n"
 close $fh;

 # reuse $fh rather than declaring it a second time
 open($fh, '<', \$string) or die "Could not open string for reading: $!";
 my $x = <$fh>; # $x now contains "foo\n"
541
542With older versions of Perl, the C<IO::String> module provides similar
543functionality.
544
545=head2 How can I output my numbers with commas added?
546X<number, commify>
547
548(contributed by brian d foy and Benjamin Goldberg)
549
550You can use L<Number::Format> to separate places in a number.
551It handles locale information for those of you who want to insert
552full stops instead (or anything else that they want to use,
553really).
554
555This subroutine will add commas to your number:
556
557 sub commify {
558 local $_ = shift;
559 1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
560 return $_;
561 }
562
563This regex from Benjamin Goldberg will add commas to numbers:
564
565 s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;
566
567It is easier to see with comments:
568
569 s/(
570 ^[-+]? # beginning of number.
571 \d+? # first digits before first comma
572 (?= # followed by, (but not included in the match) :
573 (?>(?:\d{3})+) # some positive multiple of three digits.
574 (?!\d) # an *exact* multiple, not x * 3 + 1 or whatever.
575 )
576 | # or:
577 \G\d{3} # after the last group, get three digits
578 (?=\d) # but they have to have more digits after them.
579 )/$1,/xg;
580
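To try Benjamin Goldberg's regex on a few values, it can be wrapped in
a small subroutine. The name C<commify_regex> is made up for this
sketch, and the sample values are arbitrary:

```perl
use strict;
use warnings;

# Apply the regex shown above to a copy of the argument.
sub commify_regex {
    my ($number) = @_;
    $number =~ s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;
    return $number;
}

for my $n ( 999, 12345, '-1234.5', '1234567.89' ) {
    printf "%-12s => %s\n", $n, commify_regex($n);
}
```

Digits after the decimal point are left alone, because the lookahead
only accepts a run of digits whose length is an exact multiple of three
and which is not followed by another digit.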
581=head2 How can I translate tildes (~) in a filename?
582X<tilde> X<tilde expansion>
583
584Use the <> (glob()) operator, documented in L<perlfunc>. Older
585versions of Perl require that you have a shell installed that groks
586tildes. Recent perl versions have this feature built in. The
587File::KGlob module (available from CPAN) gives more portable glob
588functionality.
589
590Within Perl, you may use this directly:
591
592 $filename =~ s{
593 ^ ~ # find a leading tilde
594 ( # save this in $1
595 [^/] # a non-slash character
596 * # repeated 0 or more times (0 means me)
597 )
598 }{
599 $1
600 ? (getpwnam($1))[7]
601 : ( $ENV{HOME} || $ENV{LOGDIR} )
602 }ex;
603
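Wrapped in a subroutine, the substitution above might look like the
following sketch. It differs from the original in two labeled ways: a
C<~user> prefix is left untouched when C<getpwnam> does not know the
user, and the current user's passwd entry is tried if neither
environment variable is set. The name C<expand_tilde> is invented, the
C<//> operator needs Perl 5.10 or later, and C<getpwnam> is assumed to
be available (it is not on every platform):

```perl
use strict;
use warnings;

# Expand a leading ~ or ~user; leave the name alone if the user is unknown.
sub expand_tilde {
    my ($filename) = @_;
    $filename =~ s{ ^ ~ ( [^/]* ) }{
        length($1)
            ? ( (getpwnam($1))[7] // "~$1" )
            : ( $ENV{HOME} || $ENV{LOGDIR} || (getpwuid($<))[7] )
    }ex;
    return $filename;
}

print expand_tilde('~/notes.txt'), "\n";
```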
604=head2 How come when I open a file read-write it wipes it out?
605X<clobber> X<read-write> X<clobbering> X<truncate> X<truncating>
606
607Because you're using something like this, which truncates the file and
608I<then> gives you read-write access:
609
610 open(FH, "+> /path/name"); # WRONG (almost always)
611
612Whoops. You should instead use this, which will fail if the file
613doesn't exist.
614
615 open(FH, "+< /path/name"); # open for update
616
617Using ">" always clobbers or creates. Using "<" never does
618either. The "+" doesn't change this.
619
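A quick way to convince yourself of the difference is to check a file's
size right after opening it in each mode. The helper C<size_after_open>
below is a made-up name for this sketch, which uses the three-argument
form of C<open()>:

```perl
use strict;
use warnings;

# Hypothetical helper: open $path in the given mode, close it again, and
# report how many bytes the file holds afterwards.
sub size_after_open {
    my ($mode, $path) = @_;
    open( my $fh, $mode, $path ) or die "Can't open $path with '$mode': $!";
    close $fh;
    return (stat $path)[7];
}
```

Called on an existing five-byte file, C<size_after_open('+E<lt>', $path)>
should report 5, while C<size_after_open('+E<gt>', $path)> should report 0.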
620Here are examples of many kinds of file opens. Those using sysopen()
621all assume
622
623 use Fcntl;
624
625To open file for reading:
626
627 open(FH, "< $path") || die $!;
628 sysopen(FH, $path, O_RDONLY) || die $!;
629
630To open file for writing, create new file if needed or else truncate old file:
631
632 open(FH, "> $path") || die $!;
633 sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT) || die $!;
634 sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666) || die $!;
635
636To open file for writing, create new file, file must not exist:
637
638 sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT) || die $!;
639 sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT, 0666) || die $!;
640
641To open file for appending, create if necessary:
642
643 open(FH, ">> $path") || die $!;
644 sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT) || die $!;
645 sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT, 0666) || die $!;
646
647To open file for appending, file must exist:
648
649 sysopen(FH, $path, O_WRONLY|O_APPEND) || die $!;
650
651To open file for update, file must exist:
652
653 open(FH, "+< $path") || die $!;
654 sysopen(FH, $path, O_RDWR) || die $!;
655
656To open file for update, create file if necessary:
657
658 sysopen(FH, $path, O_RDWR|O_CREAT) || die $!;
659 sysopen(FH, $path, O_RDWR|O_CREAT, 0666) || die $!;
660
661To open file for update, file must not exist:
662
663 sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT) || die $!;
664 sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT, 0666) || die $!;
665
666To open a file without blocking, creating if necessary:
667
 sysopen(FH, "/foo/somefile", O_WRONLY|O_NDELAY|O_CREAT)
 or die "can't open /foo/somefile: $!";
670
671Be warned that neither creation nor deletion of files is guaranteed to
672be an atomic operation over NFS. That is, two processes might both
673successfully create or unlink the same file! Therefore O_EXCL
674isn't as exclusive as you might wish.
675
676See also the new L<perlopentut> if you have it (new for 5.6).
677
678=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>?
679X<argument list too long>
680
681The C<< <> >> operator performs a globbing operation (see above).
682In Perl versions earlier than v5.6.0, the internal glob() operator forks
683csh(1) to do the actual glob expansion, but
684csh can't handle more than 127 items and so gives the error message
685C<Argument list too long>. People who installed tcsh as csh won't
686have this problem, but their users may be surprised by it.
687
688To get around this, either upgrade to Perl v5.6.0 or later, do the glob
689yourself with readdir() and patterns, or use a module like File::KGlob,
690one that doesn't use the shell to do globbing.
691
692=head2 Is there a leak/bug in glob()?
693X<glob>
694
695Due to the current implementation on some operating systems, when you
696use the glob() function or its angle-bracket alias in a scalar
697context, you may cause a memory leak and/or unpredictable behavior. It's
698best therefore to use glob() only in list context.
699
700=head2 How can I open a file with a leading ">" or trailing blanks?
701X<filename, special characters>
702
703(contributed by Brian McCauley)
704
705The special two argument form of Perl's open() function ignores
706trailing blanks in filenames and infers the mode from certain leading
707characters (or a trailing "|"). In older versions of Perl this was the
708only version of open() and so it is prevalent in old code and books.
709
710Unless you have a particular reason to use the two argument form you
711should use the three argument form of open() which does not treat any
712characters in the filename as special.
713
714 open FILE, "<", " file "; # filename is " file "
715 open FILE, ">", ">file"; # filename is ">file"
716
717=head2 How can I reliably rename a file?
718X<rename> X<mv> X<move> X<file, rename> X<ren>
719
720If your operating system supports a proper mv(1) utility or its
721functional equivalent, this works:
722
723 rename($old, $new) or system("mv", $old, $new);
724
It may be more portable to use the File::Copy module instead.
You just copy the file to the new name (checking return
values), then delete the old one. This isn't really the same
semantically as a rename(), which preserves meta-information like
permissions, timestamps, inode info, etc.
730
731Newer versions of File::Copy export a move() function.
732
733=head2 How can I lock a file?
734X<lock> X<file, lock> X<flock>
735
736Perl's builtin flock() function (see L<perlfunc> for details) will call
737flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and
738later), and lockf(3) if neither of the two previous system calls exists.
739On some systems, it may even use a different form of native locking.
740Here are some gotchas with Perl's flock():
741
742=over 4
743
744=item 1
745
746Produces a fatal error if none of the three system calls (or their
747close equivalent) exists.
748
749=item 2
750
751lockf(3) does not provide shared locking, and requires that the
752filehandle be open for writing (or appending, or read/writing).
753
754=item 3
755
756Some versions of flock() can't lock files over a network (e.g. on NFS file
757systems), so you'd need to force the use of fcntl(2) when you build Perl.
758But even this is dubious at best. See the flock entry of L<perlfunc>
759and the F<INSTALL> file in the source distribution for information on
760building Perl to do this.
761
=back

Two potentially non-obvious but traditional flock semantics are that
it waits indefinitely until the lock is granted, and that its locks are
I<merely advisory>. Such discretionary locks are more flexible, but
offer fewer guarantees. This means that files locked with flock() may
be modified by programs that do not also use flock(). Cars that stop
for red lights get on well with each other, but not with cars that don't
stop for red lights. See the perlport manpage, your port's specific
documentation, or your system-specific local manpages for details. It's
best to assume traditional behavior if you're writing portable programs.
(If you're not, you should as always feel perfectly free to write
for your own system's idiosyncrasies (sometimes called "features").
Slavish adherence to portability concerns shouldn't get in the way of
your getting your job done.)

For more information on file locking, see also
L<perlopentut/"File Locking"> if you have it (new for 5.6).
780
781=head2 Why can't I just open(FH, "E<gt>file.lock")?
782X<lock, lockfile race condition>
783
784A common bit of code B<NOT TO USE> is this:
785
786 sleep(3) while -e "file.lock"; # PLEASE DO NOT USE
787 open(LCK, "> file.lock"); # THIS BROKEN CODE
788
789This is a classic race condition: you take two steps to do something
790which must be done in one. That's why computer hardware provides an
791atomic test-and-set instruction. In theory, this "ought" to work:
792
793 sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
794 or die "can't open file.lock: $!";
795
796except that lamentably, file creation (and deletion) is not atomic
797over NFS, so this won't work (at least, not every time) over the net.
798Various schemes involving link() have been suggested, but
799these tend to involve busy-wait, which is also less than desirable.
800
801=head2 I still don't get locking. I just want to increment the number in the file. How can I do this?
802X<counter> X<file, counter>
803
804Didn't anyone ever tell you web-page hit counters were useless?
805They don't count number of hits, they're a waste of time, and they serve
806only to stroke the writer's vanity. It's better to pick a random number;
807they're more realistic.
808
809Anyway, this is what you can do if you can't help yourself.
810
811 use Fcntl qw(:DEFAULT :flock);
812 sysopen(FH, "numfile", O_RDWR|O_CREAT) or die "can't open numfile: $!";
813 flock(FH, LOCK_EX) or die "can't flock numfile: $!";
814 $num = <FH> || 0;
815 seek(FH, 0, 0) or die "can't rewind numfile: $!";
816 truncate(FH, 0) or die "can't truncate numfile: $!";
817 (print FH $num+1, "\n") or die "can't write numfile: $!";
818 close FH or die "can't close numfile: $!";
819
820Here's a much better web-page hit counter:
821
822 $hits = int( (time() - 850_000_000) / rand(1_000) );
823
824If the count doesn't impress your friends, then the code might. :-)
825
826=head2 All I want to do is append a small amount of text to the end of a file. Do I still have to use locking?
827X<append> X<file, append>
828
If you are on a system that correctly implements flock() and you use the
example appending code from "perldoc -f flock", everything will be OK
even if the OS you are on doesn't implement append mode correctly (if
such a system exists). So if you are happy to restrict yourself to OSs
that implement flock() (and that's not really much of a restriction),
then that is what you should do.
835
836If you know you are only going to use a system that does correctly
837implement appending (i.e. not Win32) then you can omit the seek() from
838the code in the previous answer.
839
If you know you are only writing code to run on an OS and filesystem that
does implement append mode correctly (a local filesystem on a modern
Unix, for example), and you keep the file in block-buffered mode and you
write less than one buffer-full of output between each manual flushing
of the buffer, then each bufferload is almost guaranteed to be written to
the end of the file in one chunk without getting intermingled with
anyone else's output. You can also use the syswrite() function, which is
simply a wrapper around your system's write(2) system call.
848
849There is still a small theoretical chance that a signal will interrupt
850the system level write() operation before completion. There is also a
851possibility that some STDIO implementations may call multiple system
852level write()s even if the buffer was empty to start. There may be some
853systems where this probability is reduced to zero.
854
855=head2 How do I randomly update a binary file?
856X<file, binary patch>
857
858If you're just trying to patch a binary, in many cases something as
859simple as this works:
860
861 perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs
862
However, if you have fixed-size records, then you might do something more
like this:
865
866 $RECSIZE = 220; # size of record, in bytes
867 $recno = 37; # which record to update
868 open(FH, "+<somewhere") || die "can't update somewhere: $!";
869 seek(FH, $recno * $RECSIZE, 0);
870 read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!";
871 # munge the record
872 seek(FH, -$RECSIZE, 1);
873 print FH $record;
874 close FH;
875
876Locking and error checking are left as an exercise for the reader.
877Don't forget them or you'll be quite sorry.
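
For comparison, here is one way the record update might look with the
locking and error checking filled in. This is only a sketch: the helper
name C<update_record> and its callback argument are hypothetical, and it
assumes flock() works on your system.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);

# Hypothetical helper: rewrite record number $recno of $path in place.
# $munge is a callback that receives the old record and returns the new one.
sub update_record {
    my ($path, $recno, $recsize, $munge) = @_;
    open(my $fh, '+<', $path) or die "can't update $path: $!";
    binmode($fh);
    flock($fh, LOCK_EX)       or die "can't lock $path: $!";
    seek($fh, $recno * $recsize, SEEK_SET) or die "can't seek: $!";
    my $got = read($fh, my $record, $recsize);
    defined $got     or die "can't read record $recno: $!";
    $got == $recsize or die "short read on record $recno";
    my $new = $munge->($record);
    length($new) == $recsize or die "new record has the wrong size";
    # Go back to the start of the record before overwriting it.
    seek($fh, $recno * $recsize, SEEK_SET) or die "can't seek: $!";
    print {$fh} $new or die "can't write record $recno: $!";
    close($fh)       or die "can't close $path: $!";
    return 1;
}
```

For example, C<update_record("somewhere", 37, 220, sub { uc $_[0] })>
would upper-case record 37 of a file made of 220-byte records.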
878
879=head2 How do I get a file's timestamp in perl?
880X<timestamp> X<file, timestamp>
881
882If you want to retrieve the time at which the file was last
883read, written, or had its meta-data (owner, etc) changed,
884you use the B<-A>, B<-M>, or B<-C> file test operations as
885documented in L<perlfunc>. These retrieve the age of the
886file (measured against the start-time of your program) in
887days as a floating point number. Some platforms may not have
888all of these times. See L<perlport> for details. To
889retrieve the "raw" time in seconds since the epoch, you
890would call the stat function, then use localtime(),
891gmtime(), or POSIX::strftime() to convert this into
892human-readable form.
893
894Here's an example:
895
896 $write_secs = (stat($file))[9];
897 printf "file %s updated at %s\n", $file,
898 scalar localtime($write_secs);
899
900If you prefer something more legible, use the File::stat module
901(part of the standard distribution in version 5.004 and later):
902
903 # error checking left as an exercise for reader.
904 use File::stat;
905 use Time::localtime;
906 $date_string = ctime(stat($file)->mtime);
907 print "file $file updated at $date_string\n";
908
909The POSIX::strftime() approach has the benefit of being,
910in theory, independent of the current locale. See L<perllocale>
911for details.
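
The text mentions POSIX::strftime() without showing it. Here is a
minimal sketch; the helper name C<mtime_string> is hypothetical, and the
format uses only numeric conversions so that the result does not depend
on locale-specific month or day names.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: format a file's modification time as
# "YYYY-MM-DD HH:MM:SS" in the local time zone.
sub mtime_string {
    my ($file) = @_;
    my @st = stat($file) or die "can't stat $file: $!";
    # Element 9 of the stat list is the modification time in epoch seconds.
    return strftime("%Y-%m-%d %H:%M:%S", localtime($st[9]));
}
```

Use gmtime() in place of localtime() if you would rather report UTC.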
912
913=head2 How do I set a file's timestamp in perl?
914X<timestamp> X<file, timestamp>
915
916You use the utime() function documented in L<perlfunc/utime>.
917By way of example, here's a little program that copies the
918read and write times from its first argument to all the rest
919of them.
920
921 if (@ARGV < 2) {
922 die "usage: cptimes timestamp_file other_files ...\n";
923 }
924 $timestamp = shift;
925 ($atime, $mtime) = (stat($timestamp))[8,9];
926 utime $atime, $mtime, @ARGV;
927
928Error checking is, as usual, left as an exercise for the reader.
929
930The perldoc for utime also has an example that has the same
931effect as touch(1) on files that I<already exist>.
932
933Certain file systems have a limited ability to store the times
934on a file at the expected level of precision. For example, the
FAT and HPFS filesystems are unable to create dates on files with
936a finer granularity than two seconds. This is a limitation of
937the filesystems, not of utime().
938
939=head2 How do I print to more than one file at once?
940X<print, to multiple files>
941
942To connect one filehandle to several output filehandles,
943you can use the IO::Tee or Tie::FileHandle::Multiplex modules.
944
945If you only have to do this once, you can print individually
946to each filehandle.
947
948 for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
949
950=head2 How can I read in an entire file all at once?
951X<slurp> X<file, slurping>
952
953You can use the File::Slurp module to do it in one step.
954
955 use File::Slurp;
956
    $all_of_it = read_file($filename); # entire file in scalar
    @all_lines = read_file($filename); # one line per element
959
960The customary Perl approach for processing all the lines in a file is to
961do so one line at a time:
962
963 open (INPUT, $file) || die "can't open $file: $!";
964 while (<INPUT>) {
965 chomp;
966 # do something with $_
967 }
968 close(INPUT) || die "can't close $file: $!";
969
970This is tremendously more efficient than reading the entire file into
971memory as an array of lines and then processing it one element at a time,
972which is often--if not almost always--the wrong approach. Whenever
973you see someone do this:
974
975 @lines = <INPUT>;
976
977you should think long and hard about why you need everything loaded at
978once. It's just not a scalable solution. You might also find it more
979fun to use the standard Tie::File module, or the DB_File module's
980$DB_RECNO bindings, which allow you to tie an array to a file so that
accessing an element of the array actually accesses the corresponding
982line in the file.
983
984You can read the entire filehandle contents into a scalar.
985
986 {
987 local(*INPUT, $/);
988 open (INPUT, $file) || die "can't open $file: $!";
989 $var = <INPUT>;
990 }
991
992That temporarily undefs your record separator, and will automatically
993close the file at block exit. If the file is already open, just use this:
994
995 $var = do { local $/; <INPUT> };
996
997For ordinary files you can also use the read function.
998
999 read( INPUT, $var, -s INPUT );
1000
1001The third argument tests the byte size of the data on the INPUT filehandle
1002and reads that many bytes into the buffer $var.
1003
1004=head2 How can I read in a file by paragraphs?
1005X<file, reading by paragraphs>
1006
1007Use the C<$/> variable (see L<perlvar> for details). You can either
1008set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
1009for instance, gets treated as two paragraphs and not three), or
1010C<"\n\n"> to accept empty paragraphs.
1011
1012Note that a blank line must have no blanks in it. Thus
1013S<C<"fred\n \nstuff\n\n">> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.
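
A short sketch of paragraph mode in practice follows. The helper name
C<read_paragraphs> is hypothetical; it sets C<$/> to C<""> so that runs
of blank lines separate the records.

```perl
use strict;
use warnings;

# Hypothetical helper: return the paragraphs of $file as a list.
sub read_paragraphs {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "can't open $file: $!";
    local $/ = "";     # paragraph mode: one or more blank lines end a record
    my @paras = <$fh>;
    close($fh) or die "can't close $file: $!";
    chomp @paras;      # in paragraph mode, chomp removes all trailing newlines
    return @paras;
}
```

Setting C<$/> to C<"\n\n"> instead would keep empty paragraphs, as
described above.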
1014
1015=head2 How can I read a single character from a file? From the keyboard?
1016X<getc> X<file, reading one character at a time>
1017
1018You can use the builtin C<getc()> function for most filehandles, but
1019it won't (easily) work on a terminal device. For STDIN, either use
1020the Term::ReadKey module from CPAN or use the sample code in
1021L<perlfunc/getc>.
1022
1023If your system supports the portable operating system programming
1024interface (POSIX), you can use the following code, which you'll note
1025turns off echo processing as well.
1026
1027 #!/usr/bin/perl -w
1028 use strict;
1029 $| = 1;
1030 for (1..4) {
1031 my $got;
1032 print "gimme: ";
1033 $got = getone();
1034 print "--> $got\n";
1035 }
1036 exit;
1037
1038 BEGIN {
1039 use POSIX qw(:termios_h);
1040
1041 my ($term, $oterm, $echo, $noecho, $fd_stdin);
1042
1043 $fd_stdin = fileno(STDIN);
1044
1045 $term = POSIX::Termios->new();
1046 $term->getattr($fd_stdin);
1047 $oterm = $term->getlflag();
1048
1049 $echo = ECHO | ECHOK | ICANON;
1050 $noecho = $oterm & ~$echo;
1051
1052 sub cbreak {
1053 $term->setlflag($noecho);
1054 $term->setcc(VTIME, 1);
1055 $term->setattr($fd_stdin, TCSANOW);
1056 }
1057
1058 sub cooked {
1059 $term->setlflag($oterm);
1060 $term->setcc(VTIME, 0);
1061 $term->setattr($fd_stdin, TCSANOW);
1062 }
1063
1064 sub getone {
1065 my $key = '';
1066 cbreak();
1067 sysread(STDIN, $key, 1);
1068 cooked();
1069 return $key;
1070 }
1071
1072 }
1073
1074 END { cooked() }
1075
The Term::ReadKey module from CPAN may be easier to use. Recent versions
also include support for non-portable systems.
1078
1079 use Term::ReadKey;
1080 open(TTY, "</dev/tty");
1081 print "Gimme a char: ";
1082 ReadMode "raw";
1083 $key = ReadKey 0, *TTY;
1084 ReadMode "normal";
1085 printf "\nYou said %s, char number %03d\n",
1086 $key, ord $key;
1087
1088=head2 How can I tell whether there's a character waiting on a filehandle?
1089
1090The very first thing you should do is look into getting the Term::ReadKey
1091extension from CPAN. As we mentioned earlier, it now even has limited
1092support for non-portable (read: not open systems, closed, proprietary,
1093not POSIX, not Unix, etc) systems.
1094
1095You should also check out the Frequently Asked Questions list in
1096comp.unix.* for things like this: the answer is essentially the same.
1097It's very system dependent. Here's one solution that works on BSD
1098systems:
1099
1100 sub key_ready {
1101 my($rin, $nfd);
1102 vec($rin, fileno(STDIN), 1) = 1;
1103 return $nfd = select($rin,undef,undef,0);
1104 }
1105
1106If you want to find out how many characters are waiting, there's
1107also the FIONREAD ioctl call to be looked at. The I<h2ph> tool that
1108comes with Perl tries to convert C include files to Perl code, which
1109can be C<require>d. FIONREAD ends up defined as a function in the
1110I<sys/ioctl.ph> file:
1111
1112 require 'sys/ioctl.ph';
1113
1114 $size = pack("L", 0);
1115 ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
1116 $size = unpack("L", $size);
1117
1118If I<h2ph> wasn't installed or doesn't work for you, you can
1119I<grep> the include files by hand:
1120
1121 % grep FIONREAD /usr/include/*/*
1122 /usr/include/asm/ioctls.h:#define FIONREAD 0x541B
1123
1124Or write a small C program using the editor of champions:
1125
    % cat > fionread.c
    #include <stdio.h>
    #include <sys/ioctl.h>
    int main(void) {
        printf("%#08x\n", (unsigned int) FIONREAD);
        return 0;
    }
    ^D
    % cc -o fionread fionread.c
    % ./fionread
    0x4004667f
1135
1136And then hard code it, leaving porting as an exercise to your successor.
1137
1138 $FIONREAD = 0x4004667f; # XXX: opsys dependent
1139
1140 $size = pack("L", 0);
1141 ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
1142 $size = unpack("L", $size);
1143
1144FIONREAD requires a filehandle connected to a stream, meaning that sockets,
1145pipes, and tty devices work, but I<not> files.
1146
1147=head2 How do I do a C<tail -f> in perl?
1148X<tail> X<IO::Handle> X<File::Tail> X<clearerr>
1149
1150First try
1151
1152 seek(GWFILE, 0, 1);
1153
1154The statement C<seek(GWFILE, 0, 1)> doesn't change the current position,
1155but it does clear the end-of-file condition on the handle, so that the
1156next C<< <GWFILE> >> makes Perl try again to read something.
1157
1158If that doesn't work (it relies on features of your stdio implementation),
1159then you need something more like this:
1160
1161 for (;;) {
1162 for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
1163 # search for some stuff and put it into files
1164 }
1165 # sleep for a while
1166 seek(GWFILE, $curpos, 0); # seek to where we had been
1167 }
1168
1169If this still doesn't work, look into the C<clearerr> method
1170from C<IO::Handle>, which resets the error and end-of-file states
1171on the handle.
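
As an illustration of the C<clearerr> approach, here is a minimal
sketch. The helper name C<read_new_lines> is hypothetical; call it from
your own polling loop, with a sleep between calls.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: return whatever lines have been appended to the
# file behind $fh since the last call.
sub read_new_lines {
    my ($fh) = @_;
    $fh->clearerr;        # forget any earlier end-of-file or error state
    my @lines = <$fh>;    # read up to the current end of the file
    # Note: the last element may be a partial line if the writer
    # has not finished it yet.
    return @lines;
}
```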
1172
1173There's also a C<File::Tail> module from CPAN.
1174
1175=head2 How do I dup() a filehandle in Perl?
1176X<dup>
1177
1178If you check L<perlfunc/open>, you'll see that several of the ways
1179to call open() should do the trick. For example:
1180
1181 open(LOG, ">>/foo/logfile");
1182 open(STDERR, ">&LOG");
1183
1184Or even with a literal numeric descriptor:
1185
1186 $fd = $ENV{MHCONTEXTFD};
1187 open(MHCONTEXT, "<&=$fd"); # like fdopen(3S)
1188
Note that "<&STDIN" makes a copy, but "<&=STDIN" makes
1190an alias. That means if you close an aliased handle, all
1191aliases become inaccessible. This is not true with
1192a copied one.
1193
1194Error checking, as always, has been left as an exercise for the reader.
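
To make the difference between a copy and an alias concrete, here is a
small sketch with the error checking included. The helper name
C<dup_and_alias> is hypothetical. It uses the three-argument form of
open() and assumes the handle it is given is open for writing.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy and an alias of an output handle.
sub dup_and_alias {
    my ($orig) = @_;
    # ">&" duplicates the descriptor, like dup(2): a new descriptor number.
    open(my $copy, '>&', $orig)
        or die "can't dup handle: $!";
    # ">&=" reuses the same descriptor, like fdopen(3S).
    open(my $alias, '>&=', fileno($orig))
        or die "can't alias handle: $!";
    return ($copy, $alias);
}
```

Comparing C<fileno()> on the results shows the difference: the alias
reports the same descriptor number as the original, while the copy
reports a new one.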
1195
1196=head2 How do I close a file descriptor by number?
1197X<file, closing file descriptors> X<POSIX> X<close>
1198
1199If, for some reason, you have a file descriptor instead of a
1200filehandle (perhaps you used C<POSIX::open>), you can use the
1201C<close()> function from the C<POSIX> module:
1202
1203 use POSIX ();
1204
1205 POSIX::close( $fd );
1206
1207This should rarely be necessary, as the Perl C<close()> function is to be
1208used for things that Perl opened itself, even if it was a dup of a
1209numeric descriptor as with C<MHCONTEXT> above. But if you really have
1210to, you may be able to do this:
1211
    require 'sys/syscall.ph';
    $rc = syscall(&SYS_close, $fd + 0); # must force numeric
    die "can't sysclose $fd: $!" if $rc == -1;
1215
1216Or, just use the fdopen(3S) feature of C<open()>:
1217
1218 {
1219 open my( $fh ), "<&=$fd" or die "Cannot reopen fd=$fd: $!";
1220 close $fh;
1221 }
1222
1223=head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?
1224X<filename, DOS issues>
1225
1226Whoops! You just put a tab and a formfeed into that filename!
1227Remember that within double quoted strings ("like\this"), the
1228backslash is an escape character. The full list of these is in
1229L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't
1230have a file called "c:(tab)emp(formfeed)oo" or
1231"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.
1232
1233Either single-quote your strings, or (preferably) use forward slashes.
1234Since all DOS and Windows versions since something like MS-DOS 2.0 or so
1235have treated C</> and C<\> the same in a path, you might as well use the
1236one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++,
1237awk, Tcl, Java, or Python, just to mention a few. POSIX paths
1238are more portable, too.
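
The effect of the quoting style is easy to check for yourself. In this
small sketch the helper name C<describe_path> is hypothetical; it just
reports a string's length and whether it contains a tab or a formfeed.

```perl
use strict;
use warnings;

# Hypothetical helper: report a string's length and whether it
# contains a tab or a formfeed.
sub describe_path {
    my ($path) = @_;
    my $note = $path =~ /[\t\f]/ ? "has control characters" : "clean";
    return sprintf("%2d chars, %s", length($path), $note);
}

print describe_path("C:\temp\foo"), "\n";  # double quotes: \t and \f are escapes
print describe_path('C:\temp\foo'), "\n";  # single quotes: backslashes are kept
print describe_path("C:/temp/foo"), "\n";  # forward slashes: nothing to escape
```

The double-quoted string comes out at 9 characters with control
characters in it, while the other two are 11 characters and clean.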
1239
1240=head2 Why doesn't glob("*.*") get all the files?
1241X<glob>
1242
1243Because even on non-Unix ports, Perl's glob function follows standard
1244Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
1245files. This makes glob() portable even to legacy systems. Your
1246port may include proprietary globbing functions as well. Check its
1247documentation for details.
1248
1249=head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl?
1250
1251This is elaborately and painstakingly described in the
1252F<file-dir-perms> article in the "Far More Than You Ever Wanted To
1253Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz .
1254
1255The executive summary: learn how your filesystem works. The
1256permissions on a file say what can happen to the data in that file.
1257The permissions on a directory say what can happen to the list of
1258files in that directory. If you delete a file, you're removing its
1259name from the directory (so the operation depends on the permissions
1260of the directory, not of the file). If you try to write to the file,
1261the permissions of the file govern whether you're allowed to.
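
You can see this for yourself with a throwaway directory. The sketch
below assumes a Unix-like filesystem, and the helper name
C<can_delete_readonly_file> is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical demonstration: returns 1 if a read-only file inside a
# writable directory could be deleted, 0 otherwise.
sub can_delete_readonly_file {
    my $dir  = tempdir(CLEANUP => 1);   # a directory we may write to
    my $file = "$dir/protected";
    open(my $fh, '>', $file) or die "can't create $file: $!";
    close($fh)               or die "can't close $file: $!";
    chmod(0444, $file)       or die "can't chmod $file: $!";
    # unlink() changes the directory, so the file's own mode is not consulted.
    return unlink($file) ? 1 : 0;
}

print can_delete_readonly_file() ? "deleted\n" : "not deleted\n";
```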
1262
1263=head2 How do I select a random line from a file?
1264X<file, selecting a random line>
1265
1266Here's an algorithm from the Camel Book:
1267
1268 srand;
1269 rand($.) < 1 && ($line = $_) while <>;
1270
1271This has a significant advantage in space over reading the whole file
1272in. You can find a proof of this method in I<The Art of Computer
1273Programming>, Volume 2, Section 3.4.2, by Donald E. Knuth.
1274
1275You can use the File::Random module which provides a function
1276for that algorithm:
1277
1278 use File::Random qw/random_line/;
1279 my $line = random_line($filename);
1280
1281Another way is to use the Tie::File module, which treats the entire
1282file as an array. Simply access a random array element.
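
Both approaches can be sketched as subroutines. The names
C<random_line_from> and C<random_line_tied> are hypothetical. The first
is the Camel Book one-liner spelled out for an open filehandle, and the
second uses Tie::File, which returns lines without their trailing
newline by default.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use Tie::File;

# The one-liner as a sub: read every line, keeping line N
# with probability 1/N.
sub random_line_from {
    my ($fh) = @_;
    my ($line, $n);
    while (my $candidate = <$fh>) {
        $line = $candidate if rand(++$n) < 1;
    }
    return $line;    # undef if the file was empty
}

# The Tie::File variant: pick a random index into the tied array.
sub random_line_tied {
    my ($file) = @_;
    tie(my @lines, 'Tie::File', $file, mode => O_RDONLY)
        or die "can't tie $file: $!";
    my $line = @lines ? $lines[ rand @lines ] : undef;
    untie @lines;
    return $line;
}
```

The first version never holds more than two lines in memory. The second
is shorter to write, but Tie::File has to scan the file to count its
records before it can fetch the chosen one.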
1283
1284=head2 Why do I get weird spaces when I print an array of lines?
1285
1286Saying
1287
1288 print "@lines\n";
1289
1290joins together the elements of C<@lines> with a space between them.
1291If C<@lines> were C<("little", "fluffy", "clouds")> then the above
1292statement would print
1293
1294 little fluffy clouds
1295
but if each element of C<@lines> was a line of text, ending in a newline
character C<("little\n", "fluffy\n", "clouds\n")>, then it would print:
1298
    little
     fluffy
     clouds
1302
1303If your array contains lines, just print them:
1304
1305 print @lines;
1306
1307=head1 REVISION
1308
1309Revision: $Revision: 10126 $
1310
1311Date: $Date: 2007-10-27 21:29:20 +0200 (Sat, 27 Oct 2007) $
1312
1313See L<perlfaq> for source control details and availability.
1314
1315=head1 AUTHOR AND COPYRIGHT
1316
1317Copyright (c) 1997-2007 Tom Christiansen, Nathan Torkington, and
1318other authors as noted. All rights reserved.
1319
1320This documentation is free; you can redistribute it and/or modify it
1321under the same terms as Perl itself.
1322
1323Irrespective of its distribution, all code examples here are in the public
1324domain. You are permitted and encouraged to use this code and any
1325derivatives thereof in your own programs for fun or for profit as you
1326see fit. A simple comment in the code giving credit to the FAQ would
1327be courteous but is not required.