This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
More doc fixes from Abigail.
[perl5.git] / pod / perlfaq5.pod
1=head1 NAME
2
perlfaq5 - Files and Formats ($Revision: 1.24 $, $Date: 1998/07/05 15:07:20 $)
4
5=head1 DESCRIPTION
6
7This section deals with I/O and the "f" issues: filehandles, flushing,
8formats, and footers.
9
=head2 How do I flush/unbuffer an output filehandle? Why must I do this?
11
The C standard I/O library (stdio) normally buffers characters sent to
devices. This is done for efficiency reasons, so that there isn't a
system call for each byte. Any time you use print() or write() in
Perl, you go through this buffering. syswrite() circumvents stdio and
buffering.
17
In most stdio implementations, the type of output buffering and the size of
the buffer vary according to the type of device. Disk files are block
buffered, often with a buffer size of more than 2k. Pipes and sockets
are often buffered with a buffer size between 1/2 and 2k. Serial devices
(e.g. modems, terminals) are normally line-buffered, and stdio sends
the entire line when it gets the newline.
24
25Perl does not support truly unbuffered output (except insofar as you can
26C<syswrite(OUT, $char, 1)>). What it does instead support is "command
27buffering", in which a physical write is performed after every output
28command. This isn't as hard on your system as unbuffering, but does
29get the output where you want it when you want it.
30
If you expect characters to get to your device when you print them there,
you'll want to autoflush its handle.
Use select() and the C<$|> variable to control autoflushing
(see L<perlvar/$|> and L<perlfunc/select>):
35
36 $old_fh = select(OUTPUT_HANDLE);
37 $| = 1;
38 select($old_fh);
39
40Or using the traditional idiom:
41
42 select((select(OUTPUT_HANDLE), $| = 1)[0]);
43
Or if you don't mind slowly loading several thousand lines of module code
just because you're afraid of the C<$|> variable:

    use FileHandle;
    open(DEV, "+</dev/tty");      # ceci n'est pas une pipe
    DEV->autoflush(1);
50
51or the newer IO::* modules:
52
53 use IO::Handle;
54 open(DEV, ">/dev/printer"); # but is this?
55 DEV->autoflush(1);
56
57or even this:
58
    use IO::Socket;               # this one is kinda a pipe?
    $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.com',
                                  PeerPort => 'http(80)',
                                  Proto    => 'tcp');
    die "$!" unless $sock;

    $sock->autoflush();
    print $sock "GET / HTTP/1.0" . "\015\012" x 2;
    $document = join('', <$sock>);
    print "DOC IS: $document\n";
69
Note the bizarrely hardcoded carriage return and newline in their octal
equivalents. This is the ONLY way (currently) to assure a proper flush
on all platforms, including Macintosh. That's the way things work in
network programming: you really should specify the exact bit pattern
of the network line terminator. In practice, C<"\n\n"> often works,
but this is not portable.

See L<perlfaq9> for other examples of fetching URLs over the web.
78
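To see what autoflushing buys you, here is a minimal sketch. It assumes
a perl recent enough to have lexical filehandles and the core File::Temp
module (neither existed when this answer was written), and the scratch
file is purely hypothetical. It prints to an autoflushed handle and then
checks the file's size while the handle is still open:

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Hypothetical scratch file, used only for this demonstration.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh);                  # no newline translation, so sizes are exact

$fh->autoflush(1);             # same effect as selecting $fh and setting $| = 1
print {$fh} "hello\n";         # handed to the system as soon as print() returns

# The data is already in the file although the handle is still open.
my $size_while_open = -s $path;
print "bytes in file before close: $size_while_open\n";

close($fh) or die "can't close $path: $!";
```

Without the autoflush() call the size reported before the close would
normally be zero, because six bytes do not fill a block buffer.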
79=head2 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?
80
81Although humans have an easy time thinking of a text file as being a
82sequence of lines that operates much like a stack of playing cards --
83or punch cards -- computers usually see the text file as a sequence of
84bytes. In general, there's no direct way for Perl to seek to a
85particular line of a file, insert text into a file, or remove text
86from a file.
87
(There are exceptions in special circumstances. You can add or remove
data at the very end of the file. Another is replacing a sequence of
bytes with another sequence of the same length. Another is using the
C<$DB_RECNO> array bindings as documented in L<DB_File>. Yet another is
manipulating files with all lines the same length.)

The general solution is to create a temporary copy of the text file with
the changes you want, then copy that over the original. This assumes
no locking.
97
98 $old = $file;
99 $new = "$file.tmp.$$";
100 $bak = "$file.bak";
101
102 open(OLD, "< $old") or die "can't open $old: $!";
103 open(NEW, "> $new") or die "can't open $new: $!";
104
105 # Correct typos, preserving case
106 while (<OLD>) {
107 s/\b(p)earl\b/${1}erl/i;
108 (print NEW $_) or die "can't write to $new: $!";
109 }
110
111 close(OLD) or die "can't close $old: $!";
112 close(NEW) or die "can't close $new: $!";
113
114 rename($old, $bak) or die "can't rename $old to $bak: $!";
115 rename($new, $old) or die "can't rename $new to $old: $!";
116
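The copy-and-rename steps can be wrapped in a subroutine so that the
per-line edit is passed in as a callback. This is only a sketch:
C<rewrite_file> is a made-up name, it uses lexical filehandles and
three-argument open (perl 5.6 or later), and like the code above it
does no locking.

```perl
use strict;
use warnings;

# rewrite_file($file, $edit): copy $file to a temporary name, passing each
# line through the $edit callback, then swap the copy into place.
# The original is kept as "$file.bak".
sub rewrite_file {
    my ($file, $edit) = @_;
    my $new = "$file.tmp.$$";
    my $bak = "$file.bak";

    open(my $in,  '<', $file) or die "can't open $file: $!";
    open(my $out, '>', $new)  or die "can't open $new: $!";
    while (defined(my $line = <$in>)) {
        $line = $edit->($line);
        print {$out} $line or die "can't write to $new: $!";
    }
    close($in)  or die "can't close $file: $!";
    close($out) or die "can't close $new: $!";

    rename($file, $bak) or die "can't rename $file to $bak: $!";
    rename($new, $file) or die "can't rename $new to $file: $!";
    return;
}

# Example call (the file name is hypothetical):
#   rewrite_file('notes.txt',
#       sub { my $l = shift; $l =~ s/\b(p)earl\b/${1}erl/i; $l });
```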
Perl can do this sort of thing for you automatically with the C<-i>
command-line switch or the closely-related C<$^I> variable (see
L<perlrun> for more details). Note that
C<-i> may require a suffix on some non-Unix systems; see the
platform-specific documentation that came with your port.
122
    # Renumber a series of tests from the command line
    perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t

    # from a script
    local($^I, @ARGV) = ('.bak', glob("*.c"));
    while (<>) {
        if ($. == 1) {
            print "This line should appear at the top of each file\n";
        }
        s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
        print;
        close ARGV if eof;              # Reset $.
    }
136
137If you need to seek to an arbitrary line of a file that changes
138infrequently, you could build up an index of byte positions of where
139the line ends are in the file. If the file is large, an index of
140every tenth or hundredth line end would allow you to seek and read
141fairly efficiently. If the file is sorted, try the look.pl library
142(part of the standard perl distribution).
143
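An index of byte positions might be built along these lines. The
subroutine names are invented for this sketch, and it records where
each line starts (which is the same thing as where the previous line
ends):

```perl
use strict;
use warnings;

# Build an index: $index->[$n] is the byte offset at which line $n
# (counting from 0) starts.
sub build_line_index {
    my ($fh) = @_;
    my @offsets = (0);
    seek($fh, 0, 0) or die "can't rewind: $!";
    while (defined(my $line = <$fh>)) {
        push @offsets, tell($fh);
    }
    pop @offsets;       # the last entry is end-of-file, not a line start
    return \@offsets;
}

# Fetch line $n (counting from 0) through the index, without reading
# any of the lines before it.  Returns undef if there is no such line.
sub fetch_line {
    my ($fh, $index, $n) = @_;
    return if $n < 0 || $n > $#$index;
    seek($fh, $index->[$n], 0) or die "can't seek: $!";
    return scalar <$fh>;
}
```

The index is only valid as long as the file doesn't change, so rebuild
it whenever the file's modification time moves.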
144In the unique case of deleting lines at the end of a file, you
145can use tell() and truncate(). The following code snippet deletes
146the last line of a file without making a copy or reading the
147whole file into memory:
148
    open (FH, "+< $file");
    while ( <FH> ) { $addr = tell(FH) unless eof(FH) }
    truncate(FH, $addr);
152
153Error checking is left as an exercise for the reader.
154
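For readers who would rather not do the exercise, here is one way the
same technique might look with the checks filled in. C<chop_last_line>
is a made-up name, and the lexical filehandle assumes perl 5.6 or later:

```perl
use strict;
use warnings;

# Remove the last line of $file in place, checking every call.
sub chop_last_line {
    my ($file) = @_;
    open(my $fh, '+<', $file) or die "can't update $file: $!";
    my $addr = 0;               # where the last line seen so far begins
    local $_;
    while (<$fh>) {
        # After reading any line but the last, tell() is the start of
        # the following line.
        $addr = tell($fh) unless eof($fh);
    }
    truncate($fh, $addr) or die "can't truncate $file: $!";
    close($fh)           or die "can't close $file: $!";
    return;
}
```

A file with only one line ends up empty, since C<$addr> is never moved
past zero.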
155=head2 How do I count the number of lines in a file?
156
157One fairly efficient way is to count newlines in the file. The
158following program uses a feature of tr///, as documented in L<perlop>.
159If your text file doesn't end with a newline, then it's not really a
160proper text file, so this may report one fewer line than you expect.
161
162 $lines = 0;
163 open(FILE, $filename) or die "Can't open `$filename': $!";
164 while (sysread FILE, $buffer, 4096) {
165 $lines += ($buffer =~ tr/\n//);
166 }
167 close FILE;
168
169This assumes no funny games with newline translations.
170
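If you do have to cope with files whose last line lacks a newline, a
small variation on the loop above remembers the final byte read and
counts the unterminated line as well. This is a sketch with an invented
subroutine name, not a replacement for the simpler code:

```perl
use strict;
use warnings;

# Count the lines in $filename; an unterminated final line counts too.
sub count_lines {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Can't open '$filename': $!";
    binmode($fh);
    my ($lines, $last, $buffer) = (0, "\n", '');
    while (sysread($fh, $buffer, 4096)) {
        $lines += ($buffer =~ tr/\n//);     # tr counts without changing
        $last = substr($buffer, -1);        # final byte seen so far
    }
    close($fh);
    $lines++ if $last ne "\n";
    return $lines;
}
```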
=head2 How do I make a temporary file name?
172
Use the C<new_tmpfile> class method from the IO::File module to get a
filehandle opened for reading and writing. Use this if you don't
need to know the file's name.

    use IO::File;
    $fh = IO::File->new_tmpfile()
        or die "Unable to make new temporary file: $!";
180
181Or you can use the C<tmpnam> function from the POSIX module to get a
182filename that you then open yourself. Use this if you do need to know
183the file's name.
184
185 use Fcntl;
186 use POSIX qw(tmpnam);
187
188 # try new temporary filenames until we get one that didn't already
189 # exist; the check should be unnecessary, but you can't be too careful
190 do { $name = tmpnam() }
191 until sysopen(FH, $name, O_RDWR|O_CREAT|O_EXCL);
192
193 # install atexit-style handler so that when we exit or die,
194 # we automatically delete this temporary file
195 END { unlink($name) or die "Couldn't unlink $name : $!" }
196
197 # now go on to use the file ...
198
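On later perls, POSIX::tmpnam is no longer available (it was removed in
perl 5.26), and the core File::Temp module does the create-and-open
step for you. A minimal sketch, assuming File::Temp is installed and
using a made-up file name template:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# tempfile() creates and opens the file in one step and returns both the
# handle and the name.  TMPDIR => 1 puts it in the system temporary
# directory; UNLINK => 1 removes it when the program exits.
my ($fh, $name) = tempfile('scratchXXXXXX', TMPDIR => 1, UNLINK => 1);
binmode($fh);

print {$fh} "some temporary data\n" or die "can't write to $name: $!";
close($fh) or die "can't close $name: $!";
```

The trailing X characters in the template are replaced with random
characters, so the name is not predictable.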
199If you're committed to doing this by hand, use the process ID and/or
200the current time-value. If you need to have many temporary files in
201one process, use a counter:
202
    BEGIN {
        use Fcntl;
        my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMP} || $ENV{TEMP};
        my $base_name = sprintf("%s/%d-%d-0000", $temp_dir, $$, time());
        sub temp_file {
            local *FH;
            my $count = 0;
            until (defined(fileno(FH)) || $count++ > 100) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                sysopen(FH, $base_name, O_WRONLY|O_EXCL|O_CREAT);
            }
            if (defined(fileno(FH))) {
                return (*FH, $base_name);
            } else {
                return ();
            }
        }
    }
221
=head2 How can I manipulate fixed-record-length files?
223
The most efficient way is using pack() and unpack(). This is faster than
using substr() when you take many, many strings. It is slower for just a few.

Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
Berkeley-style ps:
230
    # sample input line:
    # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
    $PS_T = 'A6 A4 A7 A5 A*';
    open(PS, "ps|");
    print scalar <PS>;
    while (<PS>) {
        ($pid, $tt, $stat, $time, $command) = unpack($PS_T, $_);
        for $var (qw!pid tt stat time command!) {
            print "$var: <$$var>\n";
        }
        print 'line=', pack($PS_T, $pid, $tt, $stat, $time, $command),
                "\n";
    }
244
We've used C<$$var> in a way that is forbidden by C<use strict 'refs'>.
That is, we've promoted a string to a scalar variable reference using
symbolic references. This is ok in small programs, but doesn't scale
well. It also only works on global variables, not lexicals.
249
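One way to keep the same template while staying within C<use strict> is
to unpack into a hash slice instead of into named globals. This sketch
uses a made-up input line laid out to match the template's column
widths, rather than real ps output:

```perl
use strict;
use warnings;

my $PS_T   = 'A6 A4 A7 A5 A*';
my @fields = qw(pid tt stat time command);

# Made-up sample line whose columns are 6, 4, 7 and 5 characters wide,
# followed by the command.
my $line = sprintf("%5s %-3s %-6s %4s %s", 15158, 'p5', 'T', '0:00',
                   'perl /home/tchrist/scripts/now-what');

my %rec;
@rec{@fields} = unpack($PS_T, $line);   # hash slice instead of $$var

for my $field (@fields) {
    print "$field: <$rec{$field}>\n";
}

# pack() with the same template pads each field back out with spaces.
my $rebuilt = pack($PS_T, @rec{@fields});
print "line=$rebuilt\n";
```

Because the C<A> format strips trailing spaces on unpack and restores
them on pack, the rebuilt line is identical to the input.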
=head2 How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?

The fastest, simplest, and most direct way is to localize the typeglob
of the filehandle in question:

    local *TmpHandle;

Typeglobs are fast (especially compared with the alternatives) and
reasonably easy to use, but they also have one subtle drawback. If you
had, for example, a function named TmpHandle(), or a variable named
%TmpHandle, you just hid it from yourself.

    sub findme {
        local *HostFile;
        open(HostFile, "</etc/hosts") or die "no /etc/hosts: $!";
        local $_;               # <- VERY IMPORTANT
        while (<HostFile>) {
            print if /\b127\.(0\.0\.)?1\b/;
        }
        # *HostFile automatically closes/disappears here
    }
271
Here's how to use this in a loop to open and store a bunch of
filehandles. As hash values we'll use an ordered pair, which makes
it easy to sort the hash in insertion order.

    @names = qw(motd termcap passwd hosts);
    my $i = 0;
    foreach $filename (@names) {
        local *FH;
        open(FH, "/etc/$filename") || die "$filename: $!";
        $file{$filename} = [ $i++, *FH ];
    }

    # Using the filehandles in the hash
    foreach $name (sort { $file{$a}[0] <=> $file{$b}[0] } keys %file) {
        my $fh = $file{$name}[1];
        my $line = <$fh>;
        print "$name $. $line";
    }
290
For passing filehandles to functions, the easiest way is to
prefix them with a star, as in func(*STDIN).
See L<perlfaq7/"Passing Filehandles"> for details.

If you want to create many, anonymous handles, you should check out the
Symbol, FileHandle, or IO::Handle (etc.) modules. Here's the equivalent
code with Symbol::gensym, which is reasonably light-weight:
298
299 foreach $filename (@names) {
300 use Symbol;
301 my $fh = gensym();
302 open($fh, "/etc/$filename") || die "open /etc/$filename: $!";
303 $file{$filename} = [ $i++, $fh ];
304 }

Or here using the semi-object-oriented FileHandle, which certainly isn't
light-weight:

    use FileHandle;

    foreach $filename (@names) {
        my $fh = FileHandle->new("/etc/$filename") or die "$filename: $!";
        $file{$filename} = [ $i++, $fh ];
    }
315
316Please understand that whether the filehandle happens to be a (probably
317localized) typeglob or an anonymous handle from one of the modules,
318in no way affects the bizarre rules for managing indirect handles.
319See the next question.
320
321=head2 How can I use a filehandle indirectly?
322
323An indirect filehandle is using something other than a symbol
324in a place that a filehandle is expected. Here are ways
325to get those:
326
327 $fh = SOME_FH; # bareword is strict-subs hostile
328 $fh = "SOME_FH"; # strict-refs hostile; same package only
329 $fh = *SOME_FH; # typeglob
330 $fh = \*SOME_FH; # ref to typeglob (bless-able)
331 $fh = *SOME_FH{IO}; # blessed IO::Handle from *SOME_FH typeglob
332
Or you can use the C<new> method from the FileHandle or IO modules to
create an anonymous filehandle, store that in a scalar variable,
and use it as though it were a normal filehandle.
336
337 use FileHandle;
338 $fh = FileHandle->new();
339
340 use IO::Handle; # 5.004 or higher
341 $fh = IO::Handle->new();
342
Then use any of those as you would a normal filehandle. Anywhere that
Perl is expecting a filehandle, an indirect filehandle may be used
instead. An indirect filehandle is just a scalar variable that contains
a filehandle. Functions like C<print>, C<open>, C<seek>, or
the C<E<lt>FHE<gt>> diamond operator will accept either a real filehandle
or a scalar variable containing one:

    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    $got = <$ifh>;
    print $efh "What was that: $got";
354
If you're passing a filehandle to a function, you can write
the function in two ways:

    sub accept_fh {
        my $fh = shift;
        print $fh "Sending to indirect filehandle\n";
    }

Or it can localize a typeglob and use the filehandle directly:

    sub accept_fh {
        local *FH = shift;
        print FH "Sending to localized filehandle\n";
    }

370Both styles work with either objects or typeglobs of real filehandles.
371(They might also work with strings under some circumstances, but this
372is risky.)
373
374 accept_fh(*STDOUT);
375 accept_fh($handle);
376
377In the examples above, we assigned the filehandle to a scalar variable
378before using it. That is because only simple scalar variables,
379not expressions or subscripts into hashes or arrays, can be used with
380built-ins like C<print>, C<printf>, or the diamond operator. These are
381illegal and won't even compile:
382
383 @fd = (*STDIN, *STDOUT, *STDERR);
384 print $fd[1] "Type it: "; # WRONG
385 $got = <$fd[0]> # WRONG
386 print $fd[2] "What was that: $got"; # WRONG
387
388With C<print> and C<printf>, you get around this by using a block and
389an expression where you would place the filehandle:
390
391 print { $fd[1] } "funny stuff\n";
392 printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
393 # Pity the poor deadbeef.
394
395That block is a proper block like any other, so you can put more
396complicated code there. This sends the message out to one of two places:
397
398 $ok = -x "/bin/cat";
399 print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
400 print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n";
401
This approach of treating C<print> and C<printf> like object method
calls doesn't work for the diamond operator. That's because it's a
real operator, not just a function with a comma-less argument. Assuming
you've been storing typeglobs in your structure as we did above, you
can use the built-in function named C<readline> to read a record just
as C<E<lt>E<gt>> does. Given the initialization shown above for @fd, this
would work, but only because readline() requires a typeglob. It doesn't
work with objects or strings, which might be a bug we haven't fixed yet.
410
411 $got = readline($fd[0]);
412
413Let it be noted that the flakiness of indirect filehandles is not
414related to whether they're strings, typeglobs, objects, or anything else.
415It's the syntax of the fundamental operators. Playing the object
416game doesn't help you at all here.
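The block form of C<print> and the C<readline> function can be tried
out together without touching any real files. The sketch below makes
two assumptions that go beyond the text above: it uses in-memory
filehandles (perl 5.8 or later) as stand-ins, and it stores lexical
handles in the array rather than typeglobs.

```perl
use strict;
use warnings;

# In-memory files stand in for real ones so the sketch is self-contained.
my $input  = "first line\nsecond line\n";
my $output = '';
open(my $in,  '<', \$input)  or die "can't open in-memory input: $!";
open(my $out, '>', \$output) or die "can't open in-memory output: $!";

my @fd = ($in, $out);

# print needs a block when the handle is an expression ...
print { $fd[1] } "copying\n";

# ... and readline() takes the place of <$fd[0]>.
while (defined(my $line = readline($fd[0]))) {
    print { $fd[1] } uc($line);
}

close($_) for @fd;
```

After the handles are closed, C<$output> holds everything that was
printed through C<$fd[1]>.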

=head2 How can I set up a footer format to be used with write()?

There's no builtin way to do this, but L<perlform> has a couple of
techniques to make it possible for the intrepid hacker.
422
423=head2 How can I write() into a string?
424
425See L<perlform> for an swrite() function.
426
427=head2 How can I output my numbers with commas added?
428
429This one will do it for you:
430
431 sub commify {
432 local $_ = shift;
433 1 while s/^(-?\d+)(\d{3})/$1,$2/;
434 return $_;
435 }
436
437 $n = 23659019423.2331;
438 print "GOT: ", commify($n), "\n";
439
440 GOT: 23,659,019,423.2331
441
442You can't just:
443
444 s/^(-?\d+)(\d{3})/$1,$2/g;
445
446because you have to put the comma in and then recalculate your
447position.
448
449Alternatively, this commifies all numbers in a line regardless of
450whether they have decimal portions, are preceded by + or -, or
451whatever:
452
453 # from Andrew Johnson <ajohnson@gpu.srv.ualberta.ca>
454 sub commify {
455 my $input = shift;
456 $input = reverse $input;
457 $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
458 return reverse $input;
459 }
460
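A few sample calls show what the reverse-based version does with
decimals, signs, and numbers embedded in other text. The subroutine
below is the same substitution written with lexical variables; the
inputs are made up for illustration:

```perl
use strict;
use warnings;

# Same technique as above: reverse the string, insert a comma after every
# third digit that is followed by another digit (but not inside the
# fractional part), and reverse it back.
sub commify {
    my $input = scalar reverse($_[0]);
    $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
    return scalar reverse($input);
}

print commify("23659019423.2331"), "\n";
print commify("-1234567"), "\n";
print commify("cost: 1000000 or 999"), "\n";
```

The numbers are passed as strings here so that no floating point
rounding gets in the way.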
461=head2 How can I translate tildes (~) in a filename?
462
463Use the E<lt>E<gt> (glob()) operator, documented in L<perlfunc>. This
464requires that you have a shell installed that groks tildes, meaning
465csh or tcsh or (some versions of) ksh, and thus may have portability
466problems. The Glob::KGlob module (available from CPAN) gives more
467portable glob functionality.
468
469Within Perl, you may use this directly:
470
471 $filename =~ s{
472 ^ ~ # find a leading tilde
473 ( # save this in $1
474 [^/] # a non-slash character
475 * # repeated 0 or more times (0 means me)
476 )
477 }{
478 $1
479 ? (getpwnam($1))[7]
480 : ( $ENV{HOME} || $ENV{LOGDIR} )
481 }ex;
482
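The substitution is easier to reuse wrapped in a subroutine. This is a
sketch under a few assumptions: C<expand_tilde> is a made-up name,
getpwnam() must be implemented on your platform (it is on Unix), the
C<//> operator needs perl 5.10 or later, and an unknown user name is
left untouched rather than turned into an empty string.

```perl
use strict;
use warnings;

# Expand a leading ~ or ~user in $path; anything else is returned as is.
sub expand_tilde {
    my ($path) = @_;
    $path =~ s{
        ^ ~             # a leading tilde
        ( [^/]* )       # followed by an optional user name
    }{
        length($1)
            ? ( (getpwnam($1))[7] // "~$1" )
            : ( $ENV{HOME} || $ENV{LOGDIR} || '~' )
    }ex;
    return $path;
}
```

Testing C<length($1)> rather than the truth of C<$1> also handles a
user whose name happens to be "0".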
=head2 How come when I open a file read-write it wipes it out?

Because you're using something like this, which truncates the file and
I<then> gives you read-write access:

    open(FH, "+> /path/name");          # WRONG (almost always)

Whoops. You should instead use C<"+E<lt>">, which will fail if the file
doesn't exist. Using "E<gt>" always clobbers or creates.
Using "E<lt>" never does either. The "+" doesn't change this.

Here are examples of many kinds of file opens. Those using sysopen()
all assume

    use Fcntl;

To open file for reading:

    open(FH, "< $path")                                 || die $!;
    sysopen(FH, $path, O_RDONLY)                        || die $!;
503
504To open file for writing, create new file if needed or else truncate old file:
505
506 open(FH, "> $path") || die $!;
507 sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT) || die $!;
508 sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666) || die $!;
509
510To open file for writing, create new file, file must not exist:
511
512 sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT) || die $!;
513 sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT, 0666) || die $!;
514
515To open file for appending, create if necessary:
516
517 open(FH, ">> $path") || die $!;
518 sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT) || die $!;
519 sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT, 0666) || die $!;
520
521To open file for appending, file must exist:
522
523 sysopen(FH, $path, O_WRONLY|O_APPEND) || die $!;
524
525To open file for update, file must exist:
526
527 open(FH, "+< $path") || die $!;
528 sysopen(FH, $path, O_RDWR) || die $!;
529
530To open file for update, create file if necessary:
531
532 sysopen(FH, $path, O_RDWR|O_CREAT) || die $!;
533 sysopen(FH, $path, O_RDWR|O_CREAT, 0666) || die $!;
534
535To open file for update, file must not exist:
536
537 sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT) || die $!;
538 sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT, 0666) || die $!;
539
540To open a file without blocking, creating if necessary:
541
    sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT)
            or die "can't open /tmp/somefile: $!";
544
Be warned that neither creation nor deletion of files is guaranteed to
be an atomic operation over NFS. That is, two processes might both
successfully create or unlink the same file! Therefore O_EXCL
isn't so exclusive as you might wish.
549
550=head2 Why do I sometimes get an "Argument list too long" when I use <*>?
551
552The C<E<lt>E<gt>> operator performs a globbing operation (see above).
553By default glob() forks csh(1) to do the actual glob expansion, but
554csh can't handle more than 127 items and so gives the error message
555C<Argument list too long>. People who installed tcsh as csh won't
556have this problem, but their users may be surprised by it.
557
To get around this, either do the glob yourself with C<DirHandle>s and
patterns, or use a module like Glob::KGlob, one that doesn't use the
shell to do globbing.
561
562=head2 Is there a leak/bug in glob()?
563
564Due to the current implementation on some operating systems, when you
565use the glob() function or its angle-bracket alias in a scalar
566context, you may cause a leak and/or unpredictable behavior. It's
567best therefore to use glob() only in list context.
568
569=head2 How can I open a file with a leading "E<gt>" or trailing blanks?
570
571Normally perl ignores trailing blanks in filenames, and interprets
572certain leading characters (or a trailing "|") to mean something
573special. To avoid this, you might want to use a routine like this.
574It makes incomplete pathnames into explicit relative ones, and tacks a
575trailing null byte on the name to make perl leave it alone:
576
    sub safe_filename {
        local $_ = shift;
        return m#^/#
                ? "$_\0"
                : "./$_\0";
    }

    $fn = safe_filename("<<<something really wicked ");
    open(FH, "> $fn") or die "couldn't open $fn: $!";
586
587You could also use the sysopen() function (see L<perlfunc/sysopen>).
588
589=head2 How can I reliably rename a file?
590
591Well, usually you just use Perl's rename() function. But that may
592not work everywhere, in particular, renaming files across file systems.
593If your operating system supports a mv(1) program or its moral equivalent,
594this works:
595
596 rename($old, $new) or system("mv", $old, $new);
597
It may be more compelling to use the File::Copy module instead. You
just copy the file to the new name (checking return values),
then delete the old one. This isn't really the same semantics as a
real rename(), though, which preserves metainformation like
permissions, timestamps, inode info, etc.

Newer versions of File::Copy export a move() function.

606=head2 How can I lock a file?
607
Perl's builtin flock() function (see L<perlfunc> for details) will call
flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and
later), and lockf(3) if neither of the two previous system calls exists.
On some systems, it may even use a different form of native locking.
Here are some gotchas with Perl's flock():
613
614=over 4
615
616=item 1
617
618Produces a fatal error if none of the three system calls (or their
619close equivalent) exists.
620
621=item 2
622
623lockf(3) does not provide shared locking, and requires that the
624filehandle be open for writing (or appending, or read/writing).
625
626=item 3
627
628Some versions of flock() can't lock files over a network (e.g. on NFS
629file systems), so you'd need to force the use of fcntl(2) when you
630build Perl. See the flock entry of L<perlfunc>, and the F<INSTALL>
631file in the source distribution for information on building Perl to do
632this.
633
634=back
635
=head2 Why can't I just open(FH, ">file.lock")?
637
638A common bit of code B<NOT TO USE> is this:
639
640 sleep(3) while -e "file.lock"; # PLEASE DO NOT USE
641 open(LCK, "> file.lock"); # THIS BROKEN CODE
642
643This is a classic race condition: you take two steps to do something
644which must be done in one. That's why computer hardware provides an
645atomic test-and-set instruction. In theory, this "ought" to work:
646
    sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
        or die "can't open file.lock: $!";

except that lamentably, file creation (and deletion) is not atomic
over NFS, so this won't work (at least, not every time) over the net.
Various schemes involving link() have been suggested, but
these tend to involve busy-wait, which is also subdesirable.

=head2 I still don't get locking. I just want to increment the number in the file. How can I do this?

Didn't anyone ever tell you web-page hit counters were useless?
They don't count number of hits, they're a waste of time, and they serve
only to stroke the writer's vanity. Better to pick a random number.
It's more realistic.

Anyway, this is what you can do if you can't help yourself.
663
    use Fcntl;
    sysopen(FH, "numfile", O_RDWR|O_CREAT)      or die "can't open numfile: $!";
    flock(FH, 2)                                or die "can't flock numfile: $!";
    $num = <FH> || 0;
    seek(FH, 0, 0)                              or die "can't rewind numfile: $!";
    truncate(FH, 0)                             or die "can't truncate numfile: $!";
    (print FH $num+1, "\n")                     or die "can't write numfile: $!";
    # DO NOT UNLOCK THIS UNTIL YOU CLOSE
    close FH                                    or die "can't close numfile: $!";
673
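The same locked read-modify-write sequence can be packaged as a
subroutine that returns the new count. In this sketch the subroutine
name is invented, the magic number 2 is replaced by the C<LOCK_EX>
constant from Fcntl's C<:flock> tag, and a lexical filehandle (perl 5.6
or later) is used:

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# Increment the number stored in $file under an exclusive lock and
# return the new value.  The file is created if it doesn't exist.
sub bump_counter {
    my ($file) = @_;
    sysopen(my $fh, $file, O_RDWR|O_CREAT) or die "can't open $file: $!";
    flock($fh, LOCK_EX)                    or die "can't flock $file: $!";
    my $num = <$fh> || 0;
    chomp $num;
    seek($fh, 0, 0)            or die "can't rewind $file: $!";
    truncate($fh, 0)           or die "can't truncate $file: $!";
    print {$fh} $num + 1, "\n" or die "can't write $file: $!";
    # Closing flushes the buffer and only then releases the lock.
    close($fh)                 or die "can't close $file: $!";
    return $num + 1;
}
```

As in the code above, there is no explicit unlock: the lock is given up
by the close, after the data has been written out.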
Here's a much better web-page hit counter:
675
676 $hits = int( (time() - 850_000_000) / rand(1_000) );
677
678If the count doesn't impress your friends, then the code might. :-)
679
680=head2 How do I randomly update a binary file?
681
682If you're just trying to patch a binary, in many cases something as
683simple as this works:
684
685 perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs
686
687However, if you have fixed sized records, then you might do something more
688like this:
689
690 $RECSIZE = 220; # size of record, in bytes
691 $recno = 37; # which record to update
692 open(FH, "+<somewhere") || die "can't update somewhere: $!";
693 seek(FH, $recno * $RECSIZE, 0);
694 read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!";
695 # munge the record
696 seek(FH, $recno * $RECSIZE, 0);
697 print FH $record;
698 close FH;
699
700Locking and error checking are left as an exercise for the reader.
701Don't forget them, or you'll be quite sorry.
702
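With the error checking added (but still no locking), the record update
might be sketched as follows. C<update_record> is a hypothetical name,
and padding the new data with spaces via C<pack("A...")> is just one
choice for keeping the record length fixed:

```perl
use strict;
use warnings;

# Overwrite record number $recno (counting from 0) of $file with $new,
# padded with spaces or truncated to exactly $recsize bytes.
# Returns the record's previous contents.
sub update_record {
    my ($file, $recsize, $recno, $new) = @_;
    open(my $fh, '+<', $file) or die "can't update $file: $!";
    binmode($fh);
    seek($fh, $recno * $recsize, 0) or die "can't seek in $file: $!";
    my $got = read($fh, my $record, $recsize);
    defined($got) && $got == $recsize
        or die "can't read record $recno of $file";
    # The read moved the file position, so seek back before writing.
    seek($fh, $recno * $recsize, 0) or die "can't seek in $file: $!";
    print {$fh} pack("A$recsize", $new) or die "can't write $file: $!";
    close($fh) or die "can't close $file: $!";
    return $record;
}
```

If several processes may update the file at once, take an exclusive
flock() on the handle right after the open.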
=head2 How do I get a file's timestamp in perl?

If you want to retrieve the time at which the file was last read,
written, or had its meta-data (owner, etc) changed, you use the B<-A>,
B<-M>, or B<-C> filetest operations as documented in L<perlfunc>. These
retrieve the age of the file (measured against the start-time of your
program) in days as a floating point number. To retrieve the "raw"
time in seconds since the epoch, you would call the stat function,
then use localtime(), gmtime(), or POSIX::strftime() to convert this
into human-readable form.
713
714Here's an example:
715
    $write_secs = (stat($file))[9];
    printf "file %s updated at %s\n", $file,
        scalar localtime($write_secs);
719
720If you prefer something more legible, use the File::stat module
721(part of the standard distribution in version 5.004 and later):
722
723 use File::stat;
724 use Time::localtime;
725 $date_string = ctime(stat($file)->mtime);
726 print "file $file updated at $date_string\n";
727
728Error checking is left as an exercise for the reader.
729
730=head2 How do I set a file's timestamp in perl?
731
732You use the utime() function documented in L<perlfunc/utime>.
733By way of example, here's a little program that copies the
734read and write times from its first argument to all the rest
735of them.
736
737 if (@ARGV < 2) {
738 die "usage: cptimes timestamp_file other_files ...\n";
739 }
740 $timestamp = shift;
741 ($atime, $mtime) = (stat($timestamp))[8,9];
742 utime $atime, $mtime, @ARGV;
743
744Error checking is left as an exercise for the reader.
745
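One possible answer to that exercise is sketched below, with the logic
moved into a subroutine (C<copy_times> is a made-up name). It relies on
utime() returning the number of files it changed successfully:

```perl
use strict;
use warnings;

# Copy the access and modification times of $timestamp onto @files.
sub copy_times {
    my ($timestamp, @files) = @_;
    my ($atime, $mtime) = (stat($timestamp))[8, 9];
    defined $mtime or die "can't stat $timestamp: $!";
    my $changed = utime($atime, $mtime, @files);
    $changed == @files
        or die "only updated $changed of " . @files . " files: $!";
    return $changed;
}
```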
746Note that utime() currently doesn't work correctly with Win95/NT
747ports. A bug has been reported. Check it carefully before using
748it on those platforms.
749
750=head2 How do I print to more than one file at once?
751
752If you only have to do this once, you can do this:
753
754 for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
755
To connect one filehandle to several output filehandles, it's
easiest to use the tee(1) program if you have it, and let it take care
of the multiplexing:
759
760 open (FH, "| tee file1 file2 file3");
761
Or even:

    # make STDOUT go to three files, plus original STDOUT
    open (STDOUT, "| tee file1 file2 file3") or die "Teeing off: $!\n";
    print "whatever\n"                       or die "Writing: $!\n";
    close(STDOUT)                            or die "Closing: $!\n";

Otherwise you'll have to write your own multiplexing print
function -- or your own tee program -- or use Tom Christiansen's,
at http://www.perl.com/CPAN/authors/id/TOMC/scripts/tct.gz, which is
written in Perl and offers much greater functionality
than the stock version.
774
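A multiplexing print function of your own need not be long. Here is a
minimal sketch; the name C<print_all> and its calling convention (an
array reference of handles, then the list to print) are assumptions
made for this example:

```perl
use strict;
use warnings;

# Print the same list to every handle in @$handles.  Returns true only
# if every individual print succeeded.
sub print_all {
    my ($handles, @list) = @_;
    my $ok = 1;
    for my $fh (@$handles) {
        print {$fh} @list or $ok = 0;
    }
    return $ok;
}

# Example call (LOG is a hypothetical handle opened elsewhere):
#   print_all([\*STDOUT, \*LOG], "whatever\n") or die "Writing: $!";
```

Unlike the tee(1) approach this costs no extra process, but every
caller has to use C<print_all> instead of plain C<print>.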
775=head2 How can I read in a file by paragraphs?
776
Use the C<$/> variable (see L<perlvar> for details). You can either
set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
for instance, gets treated as two paragraphs and not three), or
C<"\n\n"> to accept empty paragraphs.
781
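Here is a small sketch of paragraph mode at work. To stay
self-contained it reads from an in-memory filehandle (perl 5.8 or
later) holding made-up text; with a real file only the open() call
would differ:

```perl
use strict;
use warnings;

my $text = "abc\n\n\n\ndef\nghi\n";
open(my $fh, '<', \$text) or die "can't open in-memory file: $!";

my @paragraphs;
{
    local $/ = "";      # paragraph mode: runs of blank lines end a record
    @paragraphs = <$fh>;
}
close($fh);

printf "%d paragraphs\n", scalar @paragraphs;
```

Localizing the input record separator in a block keeps the change from
leaking into code that expects to read line by line.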
782=head2 How can I read a single character from a file? From the keyboard?
783
784You can use the builtin C<getc()> function for most filehandles, but
785it won't (easily) work on a terminal device. For STDIN, either use
786the Term::ReadKey module from CPAN, or use the sample code in
787L<perlfunc/getc>.
788
789If your system supports POSIX, you can use the following code, which
790you'll note turns off echo processing as well.
791
792 #!/usr/bin/perl -w
793 use strict;
794 $| = 1;
795 for (1..4) {
796 my $got;
797 print "gimme: ";
798 $got = getone();
799 print "--> $got\n";
800 }
801 exit;
802
803 BEGIN {
804 use POSIX qw(:termios_h);
805
806 my ($term, $oterm, $echo, $noecho, $fd_stdin);
807
808 $fd_stdin = fileno(STDIN);
809
810 $term = POSIX::Termios->new();
811 $term->getattr($fd_stdin);
812 $oterm = $term->getlflag();
813
814 $echo = ECHO | ECHOK | ICANON;
815 $noecho = $oterm & ~$echo;
816
817 sub cbreak {
818 $term->setlflag($noecho);
819 $term->setcc(VTIME, 1);
820 $term->setattr($fd_stdin, TCSANOW);
821 }
822
823 sub cooked {
824 $term->setlflag($oterm);
825 $term->setcc(VTIME, 0);
826 $term->setattr($fd_stdin, TCSANOW);
827 }
828
829 sub getone {
830 my $key = '';
831 cbreak();
832 sysread(STDIN, $key, 1);
833 cooked();
834 return $key;
835 }
836
837 }
838
839 END { cooked() }
840
841The Term::ReadKey module from CPAN may be easier to use:
842
843 use Term::ReadKey;
844 open(TTY, "</dev/tty");
845 print "Gimme a char: ";
846 ReadMode "raw";
847 $key = ReadKey 0, *TTY;
848 ReadMode "normal";
849 printf "\nYou said %s, char number %03d\n",
850 $key, ord $key;
851
For DOS systems, Dan Carson <dbc@tc.fluke.COM> reports the following:
853
854To put the PC in "raw" mode, use ioctl with some magic numbers gleaned
855from msdos.c (Perl source file) and Ralf Brown's interrupt list (comes
856across the net every so often):
857
858 $old_ioctl = ioctl(STDIN,0,0); # Gets device info
859 $old_ioctl &= 0xff;
860 ioctl(STDIN,1,$old_ioctl | 32); # Writes it back, setting bit 5
861
862Then to read a single character:
863
864 sysread(STDIN,$c,1); # Read a single character
865
866And to put the PC back to "cooked" mode:
867
868 ioctl(STDIN,1,$old_ioctl); # Sets it back to cooked mode.
869
870So now you have $c. If C<ord($c) == 0>, you have a two byte code, which
871means you hit a special key. Read another byte with C<sysread(STDIN,$c,1)>,
872and that value tells you what combination it was according to this
873table:
874
875 # PC 2-byte keycodes = ^@ + the following:
876
877 # HEX KEYS
878 # --- ----
879 # 0F SHF TAB
880 # 10-19 ALT QWERTYUIOP
881 # 1E-26 ALT ASDFGHJKL
882 # 2C-32 ALT ZXCVBNM
883 # 3B-44 F1-F10
884 # 47-49 HOME,UP,PgUp
885 # 4B LEFT
886 # 4D RIGHT
887 # 4F-53 END,DOWN,PgDn,Ins,Del
888 # 54-5D SHF F1-F10
889 # 5E-67 CTR F1-F10
890 # 68-71 ALT F1-F10
891 # 73-77 CTR LEFT,RIGHT,END,PgDn,HOME
892 # 78-83 ALT 1234567890-=
893 # 84 CTR PgUp
894
895This is all trial and error I did a long time ago, I hope I'm reading the
896file that worked.
897
=head2 How can I tell if there's a character waiting on a filehandle?

The very first thing you should do is look into getting the Term::ReadKey
extension from CPAN. It now even has limited support for closed, proprietary
(read: not open systems, not POSIX, not Unix, etc.) systems.

You should also check out the Frequently Asked Questions list in
comp.unix.* for things like this: the answer is essentially the same.
It's very system dependent. Here's one solution that works on BSD
systems:

    sub key_ready {
        my($rin, $nfd);
        vec($rin, fileno(STDIN), 1) = 1;
        return $nfd = select($rin,undef,undef,0);
    }

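The same four-argument select() call can be applied to handles other
than STDIN. The following is only a sketch: the name C<handle_ready>
and its optional timeout argument are assumptions made for this
example, and select() on arbitrary handles is not available on every
platform.

```perl
use strict;
use warnings;

# Hypothetical generalization of key_ready(): report whether $fh has
# data ready to read, waiting at most $timeout seconds (0 = just poll).
sub handle_ready {
    my ($fh, $timeout) = @_;
    $timeout = 0 unless defined $timeout;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;          # set the bit for this descriptor
    my $nfound = select(my $rout = $rin, undef, undef, $timeout);
    return $nfound > 0 ? 1 : 0;             # -1 (error) and 0 (timeout) are both "no"
}
```

A caller might write C<handle_ready(\*STDIN)> to poll the keyboard, or
C<handle_ready($sock, 0.25)> to wait a quarter of a second on a socket.
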
If you want to find out how many characters are waiting,
there's also the FIONREAD ioctl call to be looked at.

The I<h2ph> tool that comes with Perl tries to convert C include
files to Perl code, which can be C<require>d. FIONREAD ends
up defined as a function in the I<sys/ioctl.ph> file:

    require 'sys/ioctl.ph';

    $size = pack("L", 0);
    ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

If I<h2ph> wasn't installed or doesn't work for you, you can
I<grep> the include files by hand:

    % grep FIONREAD /usr/include/*/*
    /usr/include/asm/ioctls.h:#define FIONREAD 0x541B

Or write a small C program using the editor of champions:

    % cat > fionread.c
    #include <stdio.h>
    #include <sys/ioctl.h>
    int main(void) {
        printf("%#08x\n", FIONREAD);
        return 0;
    }
    ^D
    % cc -o fionread fionread.c
    % ./fionread
    0x4004667f

And then hard-code it, leaving porting as an exercise to your successor.

    $FIONREAD = 0x4004667f;         # XXX: opsys dependent

    $size = pack("L", 0);
    ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

FIONREAD requires a filehandle connected to a stream, meaning sockets,
pipes, and tty devices work, but I<not> files.

=head2 How do I do a C<tail -f> in perl?

First try

    seek(GWFILE, 0, 1);

The statement C<seek(GWFILE, 0, 1)> doesn't change the current position,
but it does clear the end-of-file condition on the handle, so that the
next <GWFILE> makes Perl try again to read something.

If that doesn't work (it relies on features of your stdio implementation),
then you need something more like this:

    for (;;) {
        for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
            # search for some stuff and put it into files
        }
        # sleep for a while
        seek(GWFILE, $curpos, 0);  # seek to where we had been
    }

If this still doesn't work, look into the POSIX module. POSIX defines
the clearerr() method, which can remove the end of file condition on a
filehandle. The method: read until end of file, clearerr(), read some
more. Lather, rinse, repeat.

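The first approach can be wrapped in a small helper. This is only a
sketch: the name C<read_new_lines> is made up for illustration, and it
relies on the same C<seek(FH, 0, 1)> behaviour described above, so it
may not work where that trick doesn't.

```perl
use strict;
use warnings;

# Hypothetical helper: return everything currently readable on $fh,
# then clear the end-of-file condition so a later call can see more.
# Note that the last element may be a partial line if the writer
# has not finished it yet.
sub read_new_lines {
    my ($fh) = @_;
    my @lines = <$fh>;      # list context: read up to the current EOF
    seek($fh, 0, 1);        # whence 1 = relative: don't move, just reset EOF
    return @lines;
}
```

A caller would typically loop forever, printing whatever
C<read_new_lines($fh)> returns and then sleeping for a while before
trying again.
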
=head2 How do I dup() a filehandle in Perl?

If you check L<perlfunc/open>, you'll see that several of the ways
to call open() should do the trick. For example:

    open(LOG, ">>/tmp/logfile");
    open(STDERR, ">&LOG");

Or even with a literal numeric descriptor:

    $fd = $ENV{MHCONTEXTFD};
    open(MHCONTEXT, "<&=$fd");    # like fdopen(3S)

Note that "E<lt>&STDIN" makes a copy, but "E<lt>&=STDIN" makes
an alias. That means if you close an aliased handle, all
aliases become inaccessible. This is not true with
a copied one.

Error checking, as always, has been left as an exercise for the reader.

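For readers who would rather not do that exercise, here is one possible
sketch with the error checks filled in. The helper name
C<dup_and_alias> is hypothetical, and the three-argument form of open()
with lexical filehandles assumes a reasonably modern Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: given an open output handle, return a copy
# (which gets its own descriptor) and an alias (which shares the
# original descriptor).
sub dup_and_alias {
    my ($fh) = @_;
    open(my $copy,  '>&',  $fh)         or die "can't dup handle: $!";
    open(my $alias, '>&=', fileno($fh)) or die "can't alias descriptor: $!";
    return ($copy, $alias);
}
```

Comparing C<fileno($copy)> and C<fileno($alias)> with the descriptor of
the original handle shows the difference: the alias reports the same
number, while the copy reports a new one.
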
=head2 How do I close a file descriptor by number?

This should rarely be necessary, as the Perl close() function is to be
used for things that Perl opened itself, even if it was a dup of a
numeric descriptor, as with MHCONTEXT above. But if you really have
to, you may be able to do this:

    require 'sys/syscall.ph';
    $rc = syscall(&SYS_close, $fd + 0);  # must force numeric
    die "can't sysclose $fd: $!" if $rc == -1;

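Where the POSIX module is available, its close() function takes a
descriptor number directly and avoids the need for I<sys/syscall.ph>.
The following is a sketch under that assumption; the wrapper name
C<close_fd> is invented for this example.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical wrapper: close a raw descriptor number, dying on failure.
# POSIX::close() returns undef when the underlying close fails.
sub close_fd {
    my ($fd) = @_;
    defined(POSIX::close($fd))
        or die "can't close descriptor $fd: $!";
    return 1;
}
```

As with the syscall() version, this bypasses Perl's own bookkeeping, so
it should not be used on a descriptor that a Perl filehandle still owns.
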
=head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?

Whoops! You just put a tab and a formfeed into that filename!
Remember that within double quoted strings ("like\this"), the
backslash is an escape character. The full list of these is in
L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't
have a file called "c:(tab)emp(formfeed)oo" or
"c:(tab)emp(formfeed)oo.exe" on your DOS filesystem.

Either single-quote your strings, or (preferably) use forward slashes.
Since all DOS and Windows versions since something like MS-DOS 2.0 or so
have treated C</> and C<\> the same in a path, you might as well use the
one that doesn't clash with Perl -- or the POSIX shell, ANSI C and C++,
awk, Tcl, Java, or Python, just to mention a few.

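To see the effect for yourself, a quick check along these lines may
help. It is only an illustration; the function name
C<has_control_chars> is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical check: true if $path contains an ASCII control character,
# which usually means a backslash escape was interpolated by accident.
sub has_control_chars {
    my ($path) = @_;
    return $path =~ /[\x00-\x1f]/ ? 1 : 0;
}

print has_control_chars("C:\temp\foo"), "\n";   # double quotes: \t and \f interpolate
print has_control_chars('C:\temp\foo'), "\n";   # single quotes: backslashes are kept
print has_control_chars("C:/temp/foo"), "\n";   # forward slashes: nothing to escape
```

Only the first, double-quoted form contains control characters; the
single-quoted and forward-slash forms are left alone.
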
=head2 Why doesn't glob("*.*") get all the files?

Because even on non-Unix ports, Perl's glob function follows standard
Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
files. This makes glob() portable.

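The difference is easy to demonstrate in a directory that holds one
file with a dot in its name and one without. In this sketch the helper
name C<matching_files> is hypothetical, and the directory name is
assumed to contain no whitespace or glob metacharacters.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names in $dir that match $pattern.
# Assumes $dir contains no whitespace or glob metacharacters.
sub matching_files {
    my ($dir, $pattern) = @_;
    my @found = glob("$dir/$pattern");
    return @found;
}
```

In a directory containing only F<a.txt> and F<README>,
C<matching_files($dir, "*")> finds both names, while
C<matching_files($dir, "*.*")> finds only F<a.txt>.
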
=head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl?

This is elaborately and painstakingly described in the "Far More Than
You Ever Wanted To Know" in
http://www.perl.com/CPAN/doc/FMTEYEWTK/file-dir-perms .

The executive summary: learn how your filesystem works. The
permissions on a file say what can happen to the data in that file.
The permissions on a directory say what can happen to the list of
files in that directory. If you delete a file, you're removing its
name from the directory (so the operation depends on the permissions
of the directory, not of the file). If you try to write to the file,
the permissions of the file govern whether you're allowed to.

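A small experiment makes the point. The sketch below assumes a
Unix-like filesystem and a directory you are allowed to write to; the
helper name C<remove_readonly_file> is invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: make $file read-only, then try to remove it.
# Returns 1 if the unlink succeeded, 0 otherwise.
sub remove_readonly_file {
    my ($file) = @_;
    chmod(0444, $file) or die "can't chmod $file: $!";
    return unlink($file) ? 1 : 0;   # governed by the directory's permissions
}
```

Under those assumptions the unlink succeeds even though the file itself
is read-only, because only the directory's list of names is changed.
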
=head2 How do I select a random line from a file?

Here's an algorithm from the Camel Book:

    srand;
    rand($.) < 1 && ($line = $_) while <>;

This has a significant advantage in space over reading the whole
file in. A simple proof by induction is available upon
request if you doubt its correctness.

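The one-liner is compact, so here is the same idea spelled out as a
sketch. The name C<random_line> and the explicit counter are
assumptions for illustration: line number I<n> replaces the current
choice with probability 1/I<n>, which leaves every line equally likely
to be the final pick.

```perl
use strict;
use warnings;

# Hypothetical helper: return one line chosen uniformly at random
# from the already-open filehandle $fh, keeping one line in memory.
# Returns undef if the handle yields no lines.
sub random_line {
    my ($fh) = @_;
    my ($count, $chosen) = (0, undef);
    while (my $line = <$fh>) {
        $count++;
        # Replace the current choice with probability 1/$count.
        $chosen = $line if rand($count) < 1;
    }
    return $chosen;
}
```

A typical call would open the file, pass the handle to
C<random_line($fh)>, and chomp the result if it is defined.
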
=head1 AUTHOR AND COPYRIGHT

Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington.
All rights reserved.

When included as an integrated part of the Standard Distribution
of Perl or of its documentation (printed or otherwise), this work is
covered under Perl's Artistic Licence. For separate distributions of
all or part of this FAQ outside of that, see L<perlfaq>.

Irrespective of its distribution, all code examples here are public
domain. You are permitted and encouraged to use this code and any
derivatives thereof in your own programs for fun or for profit as you
see fit. A simple comment in the code giving credit to the FAQ would
be courteous but is not required.
1073see fit. A simple comment in the code giving credit to the FAQ would
1074be courteous but is not required.