=head1 NAME

perlfaq5 - Files and Formats ($Revision: 1.22 $, $Date: 1997/04/24 22:44:02 $)

=head1 DESCRIPTION

This section deals with I/O and the "f" issues: filehandles, flushing,
formats, and footers.
9
=head2 How do I flush/unbuffer an output filehandle?  Why must I do this?

The C standard I/O library (stdio) normally buffers characters sent to
devices.  This is done for efficiency reasons, so that there isn't a
system call for each byte.  Any time you use print() or write() in
Perl, you go through this buffering.  syswrite() circumvents stdio and
buffering.
17
In most stdio implementations, the type of output buffering and the size of
the buffer vary according to the type of device.  Disk files are block
buffered, often with a buffer size of more than 2k.  Pipes and sockets
are often buffered with a buffer size between 1/2 and 2k.  Serial devices
(e.g. modems, terminals) are normally line-buffered, and stdio sends
the entire line when it gets the newline.
24
25Perl does not support truly unbuffered output (except insofar as you can
26C<syswrite(OUT, $char, 1)>). What it does instead support is "command
27buffering", in which a physical write is performed after every output
28command. This isn't as hard on your system as unbuffering, but does
29get the output where you want it when you want it.
30
If you expect characters to get to your device when you print them there,
you'll want to autoflush its handle.
Use select() and the C<$|> variable to control autoflushing
(see L<perlvar/$|> and L<perlfunc/select>):
35
36 $old_fh = select(OUTPUT_HANDLE);
37 $| = 1;
38 select($old_fh);
39
40Or using the traditional idiom:
41
42 select((select(OUTPUT_HANDLE), $| = 1)[0]);
43
Or if you don't mind slowly loading several thousand lines of module code
just because you're afraid of the C<$|> variable:
68dc0745 46
47 use FileHandle;
5a964f20 48 open(DEV, "+</dev/tty"); # ceci n'est pas une pipe
68dc0745 49 DEV->autoflush(1);
50
51or the newer IO::* modules:
52
53 use IO::Handle;
54 open(DEV, ">/dev/printer"); # but is this?
55 DEV->autoflush(1);
56
57or even this:
58
    use IO::Socket;             # this one is kinda a pipe?
    $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.com',
                                  PeerPort => 'http(80)',
                                  Proto    => 'tcp');
    die "$!" unless $sock;

    $sock->autoflush();
    print $sock "GET / HTTP/1.0" . "\015\012" x 2;
    $document = join('', <$sock>);
    print "DOC IS: $document\n";
69
Note the bizarrely hardcoded carriage return and newline in their octal
equivalents.  This is the ONLY way (currently) to assure a proper flush
on all platforms, including Macintosh.  That's the way things work in
network programming: you really should specify the exact bit pattern
of the network line terminator.  In practice, C<"\n\n"> often works,
but this is not portable.

See L<perlfaq9> for other examples of fetching URLs over the web.
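
As a smaller illustration of the same autoflush idea, here is a minimal
sketch with a lexical filehandle.  It assumes a Perl recent enough to ship
File::Temp (a module not mentioned above), and the scratch file merely
stands in for whatever device or log you actually care about.

```perl
use strict;
use warnings;
use IO::Handle;                  # supplies the autoflush() method
use File::Temp qw(tempfile);     # assumption: used only to get a scratch file

# Hypothetical scratch file standing in for a device or log.
my ($out, $path) = tempfile(UNLINK => 1);

$out->autoflush(1);              # same effect as selecting $out and setting $| = 1
print {$out} "hello\n";

# With autoflush on, the bytes have reached the file before close().
my $size_after_print = -s $path;
print "bytes on disk after print: $size_after_print\n";
```

Without the autoflush() call, the size reported before close() would
normally still be zero, because the data would be sitting in the buffer.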
68dc0745 78
79=head2 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?
80
81Although humans have an easy time thinking of a text file as being a
82sequence of lines that operates much like a stack of playing cards --
83or punch cards -- computers usually see the text file as a sequence of
84bytes. In general, there's no direct way for Perl to seek to a
85particular line of a file, insert text into a file, or remove text
86from a file.
87
(There are exceptions in special circumstances.  You can add or remove data
at the very end of the file.  Another is replacing a sequence of bytes with
another sequence of the same length.  Another is using the C<$DB_RECNO>
array bindings as documented in L<DB_File>.  Yet another is manipulating
files with all lines the same length.)

The general solution is to create a temporary copy of the text file with
the changes you want, then copy that over the original.  This assumes
no locking.
68dc0745 97
98 $old = $file;
99 $new = "$file.tmp.$$";
100 $bak = "$file.bak";
101
102 open(OLD, "< $old") or die "can't open $old: $!";
103 open(NEW, "> $new") or die "can't open $new: $!";
104
105 # Correct typos, preserving case
106 while (<OLD>) {
107 s/\b(p)earl\b/${1}erl/i;
108 (print NEW $_) or die "can't write to $new: $!";
109 }
110
111 close(OLD) or die "can't close $old: $!";
112 close(NEW) or die "can't close $new: $!";
113
114 rename($old, $bak) or die "can't rename $old to $bak: $!";
115 rename($new, $old) or die "can't rename $new to $old: $!";
116
117Perl can do this sort of thing for you automatically with the C<-i>
46fc3d4c 118command-line switch or the closely-related C<$^I> variable (see
68dc0745 119L<perlrun> for more details). Note that
120C<-i> may require a suffix on some non-Unix systems; see the
121platform-specific documentation that came with your port.
122
    # Renumber a series of tests from the command line
    perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t

    # from a script
    local($^I, @ARGV) = ('.bak', glob("*.c"));
    while (<>) {
        if ($. == 1) {
            print "This line should appear at the top of each file\n";
        }
        s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
        print;
        close ARGV if eof;              # Reset $.
    }
136
137If you need to seek to an arbitrary line of a file that changes
138infrequently, you could build up an index of byte positions of where
139the line ends are in the file. If the file is large, an index of
140every tenth or hundredth line end would allow you to seek and read
141fairly efficiently. If the file is sorted, try the look.pl library
142(part of the standard perl distribution).
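
The index idea can be sketched as follows.  This is only an illustration
under stated assumptions: the helper names C<build_line_index> and
C<read_line_at> are made up here, the index stores where each line starts
(equivalent to storing where the previous line ends), and an in-memory
file is used so the example runs on its own.

```perl
use strict;
use warnings;

# Record the byte offset at which every line starts.
sub build_line_index {
    my ($fh) = @_;
    my @offset = (0);
    seek($fh, 0, 0) or die "can't rewind: $!";
    while (<$fh>) {
        push @offset, tell($fh);     # position just past this line's end
    }
    pop @offset;                     # final entry is end-of-file, not a line start
    return \@offset;
}

# Fetch line $n (counting from 0) by seeking straight to it.
sub read_line_at {
    my ($fh, $index, $n) = @_;
    return undef if $n < 0 || $n > $#$index;
    seek($fh, $index->[$n], 0) or die "can't seek: $!";
    return scalar <$fh>;
}

# Demonstration on an in-memory file so the sketch is self-contained.
my $text = "alpha\nbeta\ngamma\n";
open(my $fh, '<', \$text) or die "can't open in-memory file: $!";
my $index = build_line_index($fh);
print read_line_at($fh, $index, 2);
```

For a large file you would keep only every tenth or hundredth offset and
read forward from the nearest one, trading a little reading for a much
smaller index.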
143
144In the unique case of deleting lines at the end of a file, you
145can use tell() and truncate(). The following code snippet deletes
146the last line of a file without making a copy or reading the
147whole file into memory:
148
    open (FH, "+< $file");
    while ( <FH> ) { $addr = tell(FH) unless eof(FH) }
    truncate(FH, $addr);
152
153Error checking is left as an exercise for the reader.
154
155=head2 How do I count the number of lines in a file?
156
157One fairly efficient way is to count newlines in the file. The
158following program uses a feature of tr///, as documented in L<perlop>.
159If your text file doesn't end with a newline, then it's not really a
160proper text file, so this may report one fewer line than you expect.
161
162 $lines = 0;
163 open(FILE, $filename) or die "Can't open `$filename': $!";
164 while (sysread FILE, $buffer, 4096) {
165 $lines += ($buffer =~ tr/\n//);
166 }
167 close FILE;
168
This assumes no funny games with newline translations.

=head2 How do I make a temporary file name?

Use the C<new_tmpfile> class method from the IO::File module to get a
filehandle opened for reading and writing.  Use this if you don't
need to know the file's name.

    use IO::File;
    $fh = IO::File->new_tmpfile()
        or die "Unable to make new temporary file: $!";
180
181Or you can use the C<tmpnam> function from the POSIX module to get a
182filename that you then open yourself. Use this if you do need to know
183the file's name.
184
185 use Fcntl;
186 use POSIX qw(tmpnam);
187
188 # try new temporary filenames until we get one that didn't already
189 # exist; the check should be unnecessary, but you can't be too careful
190 do { $name = tmpnam() }
191 until sysopen(FH, $name, O_RDWR|O_CREAT|O_EXCL);
192
193 # install atexit-style handler so that when we exit or die,
194 # we automatically delete this temporary file
195 END { unlink($name) or die "Couldn't unlink $name : $!" }
196
197 # now go on to use the file ...
198
199If you're committed to doing this by hand, use the process ID and/or
200the current time-value. If you need to have many temporary files in
201one process, use a counter:
202
    BEGIN {
        use Fcntl;
        my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMP} || $ENV{TEMP};
        my $base_name = sprintf("%s/%d-%d-0000", $temp_dir, $$, time());
        sub temp_file {
            local *FH;
            my $count = 0;
            # bump the trailing counter until an exclusive create succeeds
            until (defined(fileno(FH)) || $count++ > 100) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                sysopen(FH, $base_name, O_WRONLY|O_EXCL|O_CREAT);
            }
            if (defined(fileno(FH))) {
                return (*FH, $base_name);
            } else {
                return ();
            }
        }
    }
221
=head2 How can I manipulate fixed-record-length files?

The most efficient way is using pack() and unpack().  This is faster than
using substr() when you take many, many strings.  It is slower for just a few.

Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
Berkeley-style ps:

    # sample input line:
    #   15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
    $PS_T = 'A6 A4 A7 A5 A*';
    open(PS, "ps|");
    print scalar <PS>;
    while (<PS>) {
        ($pid, $tt, $stat, $time, $command) = unpack($PS_T, $_);
        for $var (qw!pid tt stat time command!) {
            print "$var: <$$var>\n";
        }
        print 'line=', pack($PS_T, $pid, $tt, $stat, $time, $command),
                "\n";
    }

We've used C<$$var> in a way that is forbidden by C<use strict 'refs'>.
That is, we've promoted a string to a scalar variable reference using
symbolic references.  This is ok in small programs, but doesn't scale
well.  It also only works on global variables, not lexicals.
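
For comparison, here is a small sketch of the same pack()/unpack() round
trip that stays within C<use strict> by using lexicals instead of symbolic
references.  The record layout (a 5-byte id, a 10-byte name, and a 3-byte
age) is invented purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical layout: 5-byte id, 10-byte name, 3-byte age, space padded.
my $template = 'A5 A10 A3';

my $record = sprintf("%-5s%-10s%-3s", 42, "alice", 30);

# unpack() with "A" strips the trailing padding from each field.
my ($id, $name, $age) = unpack($template, $record);

# pack() with the same template pads the fields back to fixed width.
my $rebuilt = pack($template, $id, $name, $age + 1);

print "id=<$id> name=<$name> age=<$age>\n";
print "rebuilt=<$rebuilt>\n";
```

Because the template fixes every field width, the rebuilt record has
the same length as the original no matter what the field values are
(values longer than a field are truncated).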
249
=head2 How can I make a filehandle local to a subroutine?  How do I pass filehandles between subroutines?  How do I make an array of filehandles?

The fastest, simplest, and most direct way is to localize the typeglob
of the filehandle in question:

    local *TmpHandle;

Typeglobs are fast (especially compared with the alternatives) and
reasonably easy to use, but they also have one subtle drawback.  If you
had, for example, a function named TmpHandle(), or a variable named
%TmpHandle, you just hid it from yourself.

    sub findme {
        local *HostFile;
        open(HostFile, "</etc/hosts") or die "no /etc/hosts: $!";
        local $_;               # <- VERY IMPORTANT
        while (<HostFile>) {
            print if /\b127\.(0\.0\.)?1\b/;
        }
        # *HostFile automatically closes/disappears here
    }
271
272Here's how to use this in a loop to open and store a bunch of
273filehandles. We'll use as values of the hash an ordered
274pair to make it easy to sort the hash in insertion order.
275
    @names = qw(motd termcap passwd hosts);
    my $i = 0;
    foreach $filename (@names) {
        local *FH;
        open(FH, "/etc/$filename") || die "$filename: $!";
        $file{$filename} = [ $i++, *FH ];
    }

    # Using the filehandles in the array
    foreach $name (sort { $file{$a}[0] <=> $file{$b}[0] } keys %file) {
        my $fh = $file{$name}[1];
        my $line = <$fh>;
        print "$name $. $line";
    }
290
291If you want to create many, anonymous handles, you should check out the
292Symbol, FileHandle, or IO::Handle (etc.) modules. Here's the equivalent
293code with Symbol::gensym, which is reasonably light-weight:
294
295 foreach $filename (@names) {
296 use Symbol;
297 my $fh = gensym();
298 open($fh, "/etc/$filename") || die "open /etc/$filename: $!";
299 $file{$filename} = [ $i++, $fh ];
300 }
68dc0745 301
Or here using the semi-object-oriented FileHandle, which certainly isn't
light-weight:

    use FileHandle;

    foreach $filename (@names) {
        my $fh = FileHandle->new("/etc/$filename") or die "$filename: $!";
        $file{$filename} = [ $i++, $fh ];
    }

Please understand that whether the filehandle happens to be a (probably
localized) typeglob or an anonymous handle from one of the modules
in no way affects the bizarre rules for managing indirect handles.
See the next question.
316
317=head2 How can I use a filehandle indirectly?
318
319An indirect filehandle is using something other than a symbol
320in a place that a filehandle is expected. Here are ways
321to get those:
322
323 $fh = SOME_FH; # bareword is strict-subs hostile
324 $fh = "SOME_FH"; # strict-refs hostile; same package only
325 $fh = *SOME_FH; # typeglob
326 $fh = \*SOME_FH; # ref to typeglob (bless-able)
327 $fh = *SOME_FH{IO}; # blessed IO::Handle from *SOME_FH typeglob
328
Or, you can use the C<new> method from the FileHandle or IO modules to
create an anonymous filehandle, store that in a scalar variable,
and use it as though it were a normal filehandle.
332
333 use FileHandle;
334 $fh = FileHandle->new();
335
336 use IO::Handle; # 5.004 or higher
337 $fh = IO::Handle->new();
338
Then use any of those as you would a normal filehandle.  Anywhere that
Perl is expecting a filehandle, an indirect filehandle may be used
instead.  An indirect filehandle is just a scalar variable that contains
a filehandle.  Functions like C<print>, C<open>, C<seek>, or
the C<E<lt>FHE<gt>> diamond operator will accept either a real filehandle
or a scalar variable containing one:

    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    $got = <$ifh>;
    print $efh "What was that: $got";

If you're passing a filehandle to a function, you can write
the function in two ways:
353
    sub accept_fh {
        my $fh = shift;
        print $fh "Sending to indirect filehandle\n";
    }

Or it can localize a typeglob and use the filehandle directly:

    sub accept_fh {
        local *FH = shift;
        print FH "Sending to localized filehandle\n";
    }

Both styles work with either objects or typeglobs of real filehandles.
(They might also work with strings under some circumstances, but this
is risky.)
369
370 accept_fh(*STDOUT);
371 accept_fh($handle);
372
373In the examples above, we assigned the filehandle to a scalar variable
374before using it. That is because only simple scalar variables,
375not expressions or subscripts into hashes or arrays, can be used with
376built-ins like C<print>, C<printf>, or the diamond operator. These are
377illegal and won't even compile:
378
    @fd = (*STDIN, *STDOUT, *STDERR);
    print $fd[1] "Type it: ";                           # WRONG
    $got = <$fd[0]>;                                    # WRONG
    print $fd[2] "What was that: $got";                 # WRONG
383
384With C<print> and C<printf>, you get around this by using a block and
385an expression where you would place the filehandle:
386
387 print { $fd[1] } "funny stuff\n";
388 printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
389 # Pity the poor deadbeef.
390
391That block is a proper block like any other, so you can put more
392complicated code there. This sends the message out to one of two places:
393
394 $ok = -x "/bin/cat";
395 print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
396 print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n";
397
This approach of treating C<print> and C<printf> like object method
calls doesn't work for the diamond operator.  That's because it's a
real operator, not just a function with a comma-less argument.  Assuming
you've been storing typeglobs in your structure as we did above, you
can use the built-in function named C<readline> to read a record just
as C<E<lt>E<gt>> does.  Given the initialization shown above for @fd, this
would work, but only because readline() requires a typeglob.  It doesn't
work with objects or strings, which might be a bug we haven't fixed yet.
406
407 $got = readline($fd[0]);
408
409Let it be noted that the flakiness of indirect filehandles is not
410related to whether they're strings, typeglobs, objects, or anything else.
411It's the syntax of the fundamental operators. Playing the object
412game doesn't help you at all here.
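
To pull the pieces above together, here is a hedged sketch that keeps
handles in a hash and an array and reaches them through the block form of
C<print> and through C<readline>.  It assumes a reasonably modern Perl in
which C<open> can create a handle directly in a hash element and
C<readline> accepts such lexical handles; the in-memory buffers are
stand-ins for real files.

```perl
use strict;
use warnings;

# In-memory files keep the sketch self-contained; real code would open paths.
my ($out_buf, $err_buf) = ('', '');
my %fh;
open($fh{out}, '>', \$out_buf) or die "can't open out buffer: $!";
open($fh{err}, '>', \$err_buf) or die "can't open err buffer: $!";

# A hash element is not a simple scalar, so print needs the block form.
my $ok = 1;
print { $fh{out} } "plain message\n";
print { $ok ? $fh{out} : $fh{err} } "status $ok\n";
close($_) for (values %fh);

# Reading through a subscript: use readline() instead of <...>.
my $input = "first line\nsecond line\n";
open(my $in, '<', \$input) or die "can't open input buffer: $!";
my @readers = ($in);
my $got = readline($readers[0]);
print "read: $got";
```

Copying the element into a plain scalar first (C<my $h = $fh{out};>) would
work just as well; the block form simply saves the temporary.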
46fc3d4c 413
=head2 How can I set up a footer format to be used with write()?

There's no builtin way to do this, but L<perlform> has a couple of
techniques to make it possible for the intrepid hacker.
418
419=head2 How can I write() into a string?
420
421See L<perlform> for an swrite() function.
422
423=head2 How can I output my numbers with commas added?
424
425This one will do it for you:
426
427 sub commify {
428 local $_ = shift;
429 1 while s/^(-?\d+)(\d{3})/$1,$2/;
430 return $_;
431 }
432
433 $n = 23659019423.2331;
434 print "GOT: ", commify($n), "\n";
435
436 GOT: 23,659,019,423.2331
437
438You can't just:
439
440 s/^(-?\d+)(\d{3})/$1,$2/g;
441
442because you have to put the comma in and then recalculate your
443position.
444
Alternatively, this commifies all numbers in a line regardless of
whether they have decimal portions, are preceded by + or -, or
whatever:

    # from Andrew Johnson <ajohnson@gpu.srv.ualberta.ca>
    sub commify {
        my $input = shift;
        $input = reverse $input;
        $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
        # scalar() forces a string reversal even when called in list context
        return scalar reverse $input;
    }
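
A few sample calls may make the behavior of the reverse-based version
easier to see.  This is just a usage sketch: the function is repeated so
the snippet runs on its own, and the inputs are arbitrary test values.

```perl
use strict;
use warnings;

# Same technique as the reverse-based commify: reverse, group, reverse back.
sub commify {
    my ($number) = @_;
    my $input = reverse $number;     # scalar assignment, so a string reversal
    $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
    return scalar reverse $input;
}

for my $n ('1234567.891', '-1234', '999', '0.12345') {
    printf "%-12s -> %s\n", $n, commify($n);
}
```

The negative lookahead C<(?!\d*\.)> is what keeps commas out of the
fractional part: in the reversed string, those digits are the ones still
followed by a decimal point.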
456
68dc0745 457=head2 How can I translate tildes (~) in a filename?
458
459Use the E<lt>E<gt> (glob()) operator, documented in L<perlfunc>. This
460requires that you have a shell installed that groks tildes, meaning
461csh or tcsh or (some versions of) ksh, and thus may have portability
462problems. The Glob::KGlob module (available from CPAN) gives more
463portable glob functionality.
464
465Within Perl, you may use this directly:
466
467 $filename =~ s{
468 ^ ~ # find a leading tilde
469 ( # save this in $1
470 [^/] # a non-slash character
471 * # repeated 0 or more times (0 means me)
472 )
473 }{
474 $1
475 ? (getpwnam($1))[7]
476 : ( $ENV{HOME} || $ENV{LOGDIR} )
477 }ex;
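
The same substitution can be wrapped in a small function.  The sketch
below is one possible packaging, not the only one: the name
C<expand_tilde> is invented here, and it deliberately returns the name
unchanged when no home directory can be found.

```perl
use strict;
use warnings;

# Expand a leading ~ or ~user; anything else is returned untouched.
sub expand_tilde {
    my ($filename) = @_;
    return $filename unless $filename =~ m{^~([^/]*)(.*)}s;
    my ($user, $rest) = ($1, $2);
    my $home = length($user)
        ? (getpwnam($user))[7]               # ~user: look up that user's home
        : ($ENV{HOME} || $ENV{LOGDIR});      # bare ~: the current user's home
    return defined($home) ? $home . $rest : $filename;
}

print expand_tilde('~/notes.txt'), "\n";
```

Testing C<length($user)> rather than the truth of C<$1> avoids treating a
login name of C<0> as if it were the empty name.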
478
=head2 How come when I open a file read-write it wipes it out?

Because you're using something like this, which truncates the file and
I<then> gives you read-write access:

    open(FH, "+> /path/name");          # WRONG (almost always)

Whoops.  You should instead open the file with C<"+E<lt>">, which will
fail if the file doesn't exist.  Using "E<gt>" always clobbers or creates.
Using "E<lt>" never does either.  The "+" doesn't change this.

Here are examples of many kinds of file opens.  Those using sysopen()
all assume

    use Fcntl;

To open file for reading:

    open(FH, "< $path")                                 || die $!;
    sysopen(FH, $path, O_RDONLY)                        || die $!;
499
500To open file for writing, create new file if needed or else truncate old file:
501
502 open(FH, "> $path") || die $!;
503 sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT) || die $!;
504 sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666) || die $!;
505
506To open file for writing, create new file, file must not exist:
507
508 sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT) || die $!;
509 sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT, 0666) || die $!;
510
511To open file for appending, create if necessary:
512
513 open(FH, ">> $path") || die $!;
514 sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT) || die $!;
515 sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT, 0666) || die $!;
516
517To open file for appending, file must exist:
518
519 sysopen(FH, $path, O_WRONLY|O_APPEND) || die $!;
520
521To open file for update, file must exist:
522
523 open(FH, "+< $path") || die $!;
524 sysopen(FH, $path, O_RDWR) || die $!;
525
526To open file for update, create file if necessary:
527
528 sysopen(FH, $path, O_RDWR|O_CREAT) || die $!;
529 sysopen(FH, $path, O_RDWR|O_CREAT, 0666) || die $!;
530
531To open file for update, file must not exist:
532
533 sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT) || die $!;
534 sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT, 0666) || die $!;
535
536To open a file without blocking, creating if necessary:
537
    sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT)
            or die "can't open /tmp/somefile: $!";

Be warned that neither creation nor deletion of files is guaranteed to
be an atomic operation over NFS.  That is, two processes might both
successfully create or unlink the same file!  Therefore O_EXCL
isn't so exclusive as you might wish.
68dc0745 545
546=head2 Why do I sometimes get an "Argument list too long" when I use <*>?
547
548The C<E<lt>E<gt>> operator performs a globbing operation (see above).
549By default glob() forks csh(1) to do the actual glob expansion, but
550csh can't handle more than 127 items and so gives the error message
551C<Argument list too long>. People who installed tcsh as csh won't
552have this problem, but their users may be surprised by it.
553
554To get around this, either do the glob yourself with C<Dirhandle>s and
555patterns, or use a module like Glob::KGlob, one that doesn't use the
556shell to do globbing.
557
558=head2 Is there a leak/bug in glob()?
559
560Due to the current implementation on some operating systems, when you
561use the glob() function or its angle-bracket alias in a scalar
562context, you may cause a leak and/or unpredictable behavior. It's
563best therefore to use glob() only in list context.
564
565=head2 How can I open a file with a leading "E<gt>" or trailing blanks?
566
567Normally perl ignores trailing blanks in filenames, and interprets
568certain leading characters (or a trailing "|") to mean something
569special. To avoid this, you might want to use a routine like this.
570It makes incomplete pathnames into explicit relative ones, and tacks a
571trailing null byte on the name to make perl leave it alone:
572
573 sub safe_filename {
574 local $_ = shift;
575 return m#^/#
576 ? "$_\0"
577 : "./$_\0";
578 }
579
    $fn = safe_filename("<<<something really wicked ");
    open(FH, "> $fn") or die "couldn't open $fn: $!";
582
583You could also use the sysopen() function (see L<perlfunc/sysopen>).
584
585=head2 How can I reliably rename a file?
586
587Well, usually you just use Perl's rename() function. But that may
588not work everywhere, in particular, renaming files across file systems.
589If your operating system supports a mv(1) program or its moral equivalent,
590this works:
591
592 rename($old, $new) or system("mv", $old, $new);
593
It may be more compelling to use the File::Copy module instead.  You
just copy the file to the new name (checking return values),
then delete the old one.  This isn't really the same semantics as a
real rename(), though, which preserves metainformation like
permissions, timestamps, inode info, etc.

Newer versions of File::Copy export a move() function.
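
Here is a brief sketch of move() in use.  The scratch directory comes
from File::Temp, which is an assumption of this example rather than
something the answer above relies on, and the file names are made up.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);      # assumption: only used to build a scratch directory

my $dir = tempdir(CLEANUP => 1);
my ($old, $new) = ("$dir/old.txt", "$dir/new.txt");

open(my $fh, '>', $old) or die "can't create $old: $!";
print {$fh} "some data\n";
close($fh) or die "can't close $old: $!";

# move() tries rename() first and falls back to copy-then-unlink.
move($old, $new) or die "can't move $old to $new: $!";

print "old exists: ", (-e $old ? "yes" : "no"), "\n";
print "new size:   ", (-s $new), "\n";
```

As with the hand-rolled copy, the fallback path creates a fresh file, so
metainformation such as ownership may not survive a move across file
systems.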
601
=head2 How can I lock a file?

Perl's builtin flock() function (see L<perlfunc> for details) will call
flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and
later), and lockf(3) if neither of the two previous system calls exists.
On some systems, it may even use a different form of native locking.
Here are some gotchas with Perl's flock():
609
610=over 4
611
612=item 1
613
614Produces a fatal error if none of the three system calls (or their
615close equivalent) exists.
616
617=item 2
618
619lockf(3) does not provide shared locking, and requires that the
620filehandle be open for writing (or appending, or read/writing).
621
622=item 3
623
624Some versions of flock() can't lock files over a network (e.g. on NFS
625file systems), so you'd need to force the use of fcntl(2) when you
626build Perl. See the flock entry of L<perlfunc>, and the F<INSTALL>
627file in the source distribution for information on building Perl to do
628this.
629
630=back
631
=head2 Why can't I just open(FH, ">file.lock")?
633
634A common bit of code B<NOT TO USE> is this:
635
636 sleep(3) while -e "file.lock"; # PLEASE DO NOT USE
637 open(LCK, "> file.lock"); # THIS BROKEN CODE
638
639This is a classic race condition: you take two steps to do something
640which must be done in one. That's why computer hardware provides an
641atomic test-and-set instruction. In theory, this "ought" to work:
642
    sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
                or die "can't open file.lock: $!";

except that lamentably, file creation (and deletion) is not atomic
over NFS, so this won't work (at least, not every time) over the net.
Various schemes involving link() have been suggested, but
these tend to involve busy-wait, which is also subdesirable.
68dc0745 650
=head2 I still don't get locking.  I just want to increment the number in the file.  How can I do this?

Didn't anyone ever tell you web-page hit counters were useless?
They don't count number of hits, they're a waste of time, and they serve
only to stroke the writer's vanity.  Better to pick a random number.
It's more realistic.

Anyway, this is what you can do if you can't help yourself.

    use Fcntl;
    sysopen(FH, "numfile", O_RDWR|O_CREAT)   or die "can't open numfile: $!";
    flock(FH, 2)                             or die "can't flock numfile: $!";
    $num = <FH> || 0;
    seek(FH, 0, 0)                           or die "can't rewind numfile: $!";
    truncate(FH, 0)                          or die "can't truncate numfile: $!";
    (print FH $num+1, "\n")                  or die "can't write numfile: $!";
    # DO NOT UNLOCK THIS UNTIL YOU CLOSE
    close FH                                 or die "can't close numfile: $!";

Here's a much better web-page hit counter:
68dc0745 671
672 $hits = int( (time() - 850_000_000) / rand(1_000) );
673
674If the count doesn't impress your friends, then the code might. :-)
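
Returning to the locked counter, the same steps can be packaged as a
function that uses the symbolic C<LOCK_EX> constant instead of the bare
C<2>.  This is a sketch under assumptions: the helper name
C<increment_counter> is invented, and the scratch directory from
File::Temp is there only so the demonstration has somewhere to write.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);   # O_* flags plus LOCK_EX
use File::Temp qw(tempdir);      # assumption: scratch space for the demonstration

# Hypothetical helper: bump the number stored in $path and return the new value.
sub increment_counter {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDWR|O_CREAT) or die "can't open $path: $!";
    flock($fh, LOCK_EX)                    or die "can't flock $path: $!";
    my $num = <$fh> || 0;
    chomp $num;
    $num++;
    seek($fh, 0, 0)                        or die "can't rewind $path: $!";
    truncate($fh, 0)                       or die "can't truncate $path: $!";
    print {$fh} "$num\n"                   or die "can't write $path: $!";
    close($fh)                             or die "can't close $path: $!";  # releases the lock
    return $num;
}

my $numfile = tempdir(CLEANUP => 1) . "/numfile";
print increment_counter($numfile), "\n" for 1 .. 3;
```

The lock is held from the read through the write, and it is only given up
by close(), which also flushes the new value to the file first.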
675
676=head2 How do I randomly update a binary file?
677
678If you're just trying to patch a binary, in many cases something as
679simple as this works:
680
681 perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs
682
683However, if you have fixed sized records, then you might do something more
684like this:
685
686 $RECSIZE = 220; # size of record, in bytes
687 $recno = 37; # which record to update
688 open(FH, "+<somewhere") || die "can't update somewhere: $!";
689 seek(FH, $recno * $RECSIZE, 0);
690 read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!";
691 # munge the record
692 seek(FH, $recno * $RECSIZE, 0);
693 print FH $record;
694 close FH;
695
696Locking and error checking are left as an exercise for the reader.
697Don't forget them, or you'll be quite sorry.
698
=head2 How do I get a file's timestamp in perl?

If you want to retrieve the time at which the file was last read,
written, or had its meta-data (owner, etc) changed, you use the B<-A>,
B<-M>, or B<-C> filetest operations as documented in L<perlfunc>.  These
retrieve the age of the file (measured against the start-time of your
program) in days as a floating point number.  To retrieve the "raw"
time in seconds since the epoch, you would call the stat function,
then use localtime(), gmtime(), or POSIX::strftime() to convert this
into human-readable form.
709
710Here's an example:
711
    $write_secs = (stat($file))[9];
    print "file $file updated at ", scalar(localtime($write_secs)), "\n";
714
715If you prefer something more legible, use the File::stat module
716(part of the standard distribution in version 5.004 and later):
717
718 use File::stat;
719 use Time::localtime;
720 $date_string = ctime(stat($file)->mtime);
721 print "file $file updated at $date_string\n";
722
723Error checking is left as an exercise for the reader.
724
725=head2 How do I set a file's timestamp in perl?
726
727You use the utime() function documented in L<perlfunc/utime>.
728By way of example, here's a little program that copies the
729read and write times from its first argument to all the rest
730of them.
731
732 if (@ARGV < 2) {
733 die "usage: cptimes timestamp_file other_files ...\n";
734 }
735 $timestamp = shift;
736 ($atime, $mtime) = (stat($timestamp))[8,9];
737 utime $atime, $mtime, @ARGV;
738
739Error checking is left as an exercise for the reader.
740
741Note that utime() currently doesn't work correctly with Win95/NT
742ports. A bug has been reported. Check it carefully before using
743it on those platforms.
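
Since error checking was left as an exercise in both timestamp answers,
here is one hedged way to do it.  The helper name C<copy_times> and the
scratch files from File::Temp are assumptions of this sketch; the
timestamp 1_000_000_000 is just a convenient fixed value.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use File::Temp qw(tempfile);     # assumption: scratch files for the demonstration

# Hypothetical helper: copy access and modification times with error checks.
sub copy_times {
    my ($source, @targets) = @_;
    my ($atime, $mtime) = (stat($source))[8, 9];
    defined $mtime or die "can't stat $source: $!";
    my $changed = utime($atime, $mtime, @targets);
    $changed == @targets
        or die "updated only $changed of " . scalar(@targets) . " files: $!";
    return strftime("%Y-%m-%d %H:%M:%S", gmtime($mtime));
}

my (undef, $src) = tempfile(UNLINK => 1);
my (undef, $dst) = tempfile(UNLINK => 1);
utime(1_000_000_000, 1_000_000_000, $src) or die "can't set times on $src: $!";
print copy_times($src, $dst), "\n";      # prints 2001-09-09 01:46:40
```

utime() returns the number of files it changed, so comparing that count
with the number of targets catches partial failures.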
744
745=head2 How do I print to more than one file at once?
746
747If you only have to do this once, you can do this:
748
749 for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
750
To connect one filehandle to several output filehandles, it's
easiest to use the tee(1) program if you have it, and let it take care
of the multiplexing:
754
755 open (FH, "| tee file1 file2 file3");
756
Or even:

    # make STDOUT go to three files, plus original STDOUT
    open (STDOUT, "| tee file1 file2 file3") or die "Teeing off: $!\n";
    print "whatever\n"                       or die "Writing: $!\n";
    close(STDOUT)                            or die "Closing: $!\n";

Otherwise you'll have to write your own multiplexing print
function -- or your own tee program -- or use Tom Christiansen's,
at http://www.perl.com/CPAN/authors/id/TOMC/scripts/tct.gz, which is
written in Perl and offers much greater functionality
than the stock version.
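
A home-grown multiplexing print function need not be elaborate.  The
following is a minimal sketch, with the invented name C<tee_print> and
in-memory buffers standing in for real output files.

```perl
use strict;
use warnings;

# Hypothetical helper: send the same text to every handle in the list.
sub tee_print {
    my ($handles, @text) = @_;
    for my $fh (@$handles) {
        print {$fh} @text or die "can't write: $!";
    }
    return scalar @$handles;
}

# In-memory files stand in for real log files.
my ($first, $second) = ('', '');
open(my $fh1, '>', \$first)  or die "can't open first buffer: $!";
open(my $fh2, '>', \$second) or die "can't open second buffer: $!";

tee_print([$fh1, $fh2], "whatever\n");
close($_) for ($fh1, $fh2);

print "first:  $first";
print "second: $second";
```

Passing the handles as an array reference keeps them separate from the
text to be printed, which a flat argument list could not do reliably.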
68dc0745 769
770=head2 How can I read in a file by paragraphs?
771
Use the C<$/> variable (see L<perlvar> for details).  You can either
set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
for instance, gets treated as two paragraphs and not three), or
C<"\n\n"> to accept empty paragraphs.
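
A short sketch of paragraph mode, reading from an in-memory file so that
it is self-contained (the sample text is the one used in the explanation
above, with a final newline added):

```perl
use strict;
use warnings;

# In-memory sample: "abc" and "def" separated by a run of blank lines.
my $text = "abc\n\n\n\ndef\n";
open(my $fh, '<', \$text) or die "can't open in-memory file: $!";

my @paragraphs;
{
    local $/ = "";               # paragraph mode: runs of blank lines act as one separator
    while (my $para = <$fh>) {
        chomp $para;             # in paragraph mode chomp removes all trailing newlines
        push @paragraphs, $para;
    }
}
close($fh);

print scalar(@paragraphs), " paragraphs\n";
```

Localizing C<$/> inside a block restores ordinary line-at-a-time reading
for the rest of the program.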
776
777=head2 How can I read a single character from a file? From the keyboard?
778
779You can use the builtin C<getc()> function for most filehandles, but
780it won't (easily) work on a terminal device. For STDIN, either use
781the Term::ReadKey module from CPAN, or use the sample code in
782L<perlfunc/getc>.
783
784If your system supports POSIX, you can use the following code, which
785you'll note turns off echo processing as well.
786
787 #!/usr/bin/perl -w
788 use strict;
789 $| = 1;
790 for (1..4) {
791 my $got;
792 print "gimme: ";
793 $got = getone();
794 print "--> $got\n";
795 }
796 exit;
797
798 BEGIN {
799 use POSIX qw(:termios_h);
800
801 my ($term, $oterm, $echo, $noecho, $fd_stdin);
802
803 $fd_stdin = fileno(STDIN);
804
805 $term = POSIX::Termios->new();
806 $term->getattr($fd_stdin);
807 $oterm = $term->getlflag();
808
809 $echo = ECHO | ECHOK | ICANON;
810 $noecho = $oterm & ~$echo;
811
812 sub cbreak {
813 $term->setlflag($noecho);
814 $term->setcc(VTIME, 1);
815 $term->setattr($fd_stdin, TCSANOW);
816 }
817
818 sub cooked {
819 $term->setlflag($oterm);
820 $term->setcc(VTIME, 0);
821 $term->setattr($fd_stdin, TCSANOW);
822 }
823
824 sub getone {
825 my $key = '';
826 cbreak();
827 sysread(STDIN, $key, 1);
828 cooked();
829 return $key;
830 }
831
832 }
833
834 END { cooked() }
835
836The Term::ReadKey module from CPAN may be easier to use:
837
838 use Term::ReadKey;
839 open(TTY, "</dev/tty");
840 print "Gimme a char: ";
841 ReadMode "raw";
842 $key = ReadKey 0, *TTY;
843 ReadMode "normal";
844 printf "\nYou said %s, char number %03d\n",
845 $key, ord $key;
846
For DOS systems, Dan Carson <dbc@tc.fluke.COM> reports the following:
68dc0745 848
849To put the PC in "raw" mode, use ioctl with some magic numbers gleaned
850from msdos.c (Perl source file) and Ralf Brown's interrupt list (comes
851across the net every so often):
852
853 $old_ioctl = ioctl(STDIN,0,0); # Gets device info
854 $old_ioctl &= 0xff;
855 ioctl(STDIN,1,$old_ioctl | 32); # Writes it back, setting bit 5
856
857Then to read a single character:
858
859 sysread(STDIN,$c,1); # Read a single character
860
861And to put the PC back to "cooked" mode:
862
863 ioctl(STDIN,1,$old_ioctl); # Sets it back to cooked mode.
864
865So now you have $c. If C<ord($c) == 0>, you have a two byte code, which
866means you hit a special key. Read another byte with C<sysread(STDIN,$c,1)>,
867and that value tells you what combination it was according to this
868table:
869
870 # PC 2-byte keycodes = ^@ + the following:
871
872 # HEX KEYS
873 # --- ----
874 # 0F SHF TAB
875 # 10-19 ALT QWERTYUIOP
876 # 1E-26 ALT ASDFGHJKL
877 # 2C-32 ALT ZXCVBNM
878 # 3B-44 F1-F10
879 # 47-49 HOME,UP,PgUp
880 # 4B LEFT
881 # 4D RIGHT
882 # 4F-53 END,DOWN,PgDn,Ins,Del
883 # 54-5D SHF F1-F10
884 # 5E-67 CTR F1-F10
885 # 68-71 ALT F1-F10
886 # 73-77 CTR LEFT,RIGHT,END,PgDn,HOME
887 # 78-83 ALT 1234567890-=
888 # 84 CTR PgUp
889
890This is all trial and error I did a long time ago, I hope I'm reading the
891file that worked.
892
=head2 How can I tell if there's a character waiting on a filehandle?

The very first thing you should do is look into getting the Term::ReadKey
extension from CPAN. It now even has limited support for closed, proprietary
(read: not open systems, not POSIX, not Unix, etc.) systems.

You should also check out the Frequently Asked Questions list in
comp.unix.* for things like this: the answer is essentially the same.
It's very system dependent. Here's one solution that works on BSD
systems:

    sub key_ready {
        my $rin = '';                        # bit vector of handles to poll
        vec($rin, fileno(STDIN), 1) = 1;
        # A timeout of 0 makes select() return at once with the
        # number of handles that are ready.
        return select($rin,undef,undef,0);
    }

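As a rough sketch of the same zero-timeout poll, the core IO::Select
module wraps the four-argument select(). The helper name
C<handle_ready> is made up for this example, and a pipe stands in for
the keyboard purely so the example is self-contained; with a terminal
you would pass STDIN instead.

```perl
use strict;
use warnings;
use IO::Handle;
use IO::Select;

# Hypothetical helper: true if $fh has data that can be read
# without blocking.
sub handle_ready {
    my ($fh) = @_;
    my $sel = IO::Select->new($fh);
    # A timeout of 0 makes can_read() poll and return immediately.
    my @ready = $sel->can_read(0);
    return @ready ? 1 : 0;
}

# Illustration only: a pipe stands in for the keyboard.
pipe(my $reader, my $writer) or die "Couldn't create pipe: $!\n";
$writer->autoflush(1);

my $before = handle_ready($reader);    # nothing written yet
print {$writer} "x";
my $after  = handle_ready($reader);    # one byte is now waiting

print "ready before write: $before, after write: $after\n";
```

Like key_ready(), this only reports that something is readable; it
does not say how many characters are waiting.
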
If you want to find out how many characters are waiting,
there's also the FIONREAD ioctl call to be looked at.

The I<h2ph> tool that comes with Perl tries to convert C include
files to Perl code, which can be C<require>d. FIONREAD ends
up defined as a function in the I<sys/ioctl.ph> file:

    require 'sys/ioctl.ph';

    $size = pack("L", 0);
    ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

If I<h2ph> wasn't installed or doesn't work for you, you can
I<grep> the include files by hand:

    % grep FIONREAD /usr/include/*/*
    /usr/include/asm/ioctls.h:#define FIONREAD 0x541B

Or write a small C program using the editor of champions:

    % cat > fionread.c
    #include <stdio.h>
    #include <sys/ioctl.h>
    int main(void) {
        printf("%#08x\n", FIONREAD);
        return 0;
    }
    ^D
    % cc -o fionread fionread.c
    % ./fionread
    0x4004667f

And then hard-code it, leaving porting as an exercise to your successor.

    $FIONREAD = 0x4004667f;         # XXX: opsys dependent

    $size = pack("L", 0);
    ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

FIONREAD requires a filehandle connected to a stream, meaning sockets,
pipes, and tty devices work, but I<not> files.

=head2 How do I do a C<tail -f> in Perl?

First try

    seek(GWFILE, 0, 1);

The statement C<seek(GWFILE, 0, 1)> doesn't change the current position,
but it does clear the end-of-file condition on the handle, so that the
next <GWFILE> makes Perl try again to read something.

If that doesn't work (it relies on features of your stdio implementation),
then you need something more like this:

    for (;;) {
        for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
            # search for some stuff and put it into files
        }
        # sleep for a while
        seek(GWFILE, $curpos, 0);  # seek to where we had been
    }

If this still doesn't work, look into the POSIX module. POSIX defines
the clearerr() method, which can remove the end of file condition on a
filehandle. The method: read until end of file, clearerr(), read some
more. Lather, rinse, repeat.

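The "read until end of file, clear the condition, read some more"
cycle might be sketched as follows. The helper name C<read_new_lines>
is invented for this example, the clearerr() used here is the one
provided by the core IO::Handle module, and a temporary file stands in
for the growing log so the example is self-contained.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Hypothetical helper: return whatever lines were appended to $fh
# since the previous call.
sub read_new_lines {
    my ($fh) = @_;
    seek($fh, 0, 1);    # stay put, but reset the end-of-file condition
    $fh->clearerr;      # also clear the handle's EOF/error flags
    my @lines = <$fh>;  # read up to the current end of file
    return @lines;
}

# Illustration only: a temporary file plays the part of the log.
my ($writer, $path) = tempfile(UNLINK => 1);
$writer->autoflush(1);
open(my $reader, "<", $path) or die "Couldn't open $path: $!\n";

print {$writer} "first\n";
my @batch1 = read_new_lines($reader);

my @idle = read_new_lines($reader);     # nothing new was written

print {$writer} "second\n", "third\n";
my @batch2 = read_new_lines($reader);

print "batch 1: @batch1";
print "idle reads: ", scalar(@idle), "\n";
print "batch 2: ", join("", @batch2);
```

A real C<tail -f> would call the helper in a loop with a sleep between
calls. Note that a line still being written may come back without its
trailing newline.
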
=head2 How do I dup() a filehandle in Perl?

If you check L<perlfunc/open>, you'll see that several of the ways
to call open() should do the trick. For example:

    open(LOG, ">>/tmp/logfile");
    open(STDERR, ">&LOG");

Or even with a literal numeric descriptor:

    $fd = $ENV{MHCONTEXTFD};
    open(MHCONTEXT, "<&=$fd");     # like fdopen(3S)

Note that "E<lt>&STDIN" makes a copy, but "E<lt>&=STDIN" makes
an alias. That means if you close an aliased handle, all
aliases become inaccessible. This is not true with
a copied one.

Error checking, as always, has been left as an exercise for the reader.

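For readers who would rather not do the exercise, here is a small
sketch with the error checks filled in. It uses lexical filehandles
and the three-argument form of open(), which need a newer Perl than
the bareword examples above, and a temporary file stands in for
F</tmp/logfile>. The variable names are chosen for illustration only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Illustration only: a temporary file stands in for the log file.
my ($log, $path) = tempfile(UNLINK => 1);

# ">&" duplicates the descriptor, like dup(2): a separate descriptor.
open(my $copy, ">&", $log) or die "Couldn't dup log handle: $!\n";

# ">&=" wraps the same descriptor, like fdopen(3): an alias.
open(my $alias, ">&=", fileno($log)) or die "Couldn't alias log handle: $!\n";

my $copy_is_new_fd  = fileno($copy)  != fileno($log) ? 1 : 0;
my $alias_shares_fd = fileno($alias) == fileno($log) ? 1 : 0;

# Closing the copy leaves the original handle untouched.
print {$copy} "written through the copy\n"
    or die "Couldn't write to copy: $!\n";
close($copy) or die "Couldn't close copy: $!\n";

print "copy has its own descriptor: $copy_is_new_fd\n";
print "alias shares the descriptor: $alias_shares_fd\n";
```
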
=head2 How do I close a file descriptor by number?

This should rarely be necessary, as the Perl close() function is to be
used for things that Perl opened itself, even if it was a dup of a
numeric descriptor, as with MHCONTEXT above. But if you really have
to, you may be able to do this:

    require 'sys/syscall.ph';
    $rc = syscall(&SYS_close, $fd + 0);  # must force numeric
    die "can't sysclose $fd: $!" if $rc == -1;

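A different route, offered here only as a sketch, is the close()
function exported by the standard POSIX module, which takes a
descriptor number rather than a filehandle and returns undef on
failure. The descriptor below is created by dup()ing one end of a
pipe purely so there is something safe to close.

```perl
use strict;
use warnings;
use POSIX ();

# Illustration only: create a spare descriptor that is safe to close.
pipe(my $reader, my $writer) or die "Couldn't create pipe: $!\n";
my $fd = POSIX::dup(fileno($reader));
defined $fd or die "Couldn't dup descriptor: $!\n";

# POSIX::close() works on the number itself and returns undef on failure.
my $first  = POSIX::close($fd);
my $second = POSIX::close($fd);    # already closed, so this one fails
my $error  = defined $second ? "" : "$!";

print "first close ",  (defined $first  ? "succeeded" : "failed"), "\n";
print "second close ", (defined $second ? "succeeded" : "failed: $error"), "\n";
```
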
=head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?

Whoops! You just put a tab and a formfeed into that filename!
Remember that within double quoted strings ("like\this"), the
backslash is an escape character. The full list of these is in
L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't
have a file called "c:(tab)emp(formfeed)oo" or
"c:(tab)emp(formfeed)oo.exe" on your DOS filesystem.

Either single-quote your strings, or (preferably) use forward slashes.
Since all DOS and Windows versions since something like MS-DOS 2.0 or so
have treated C</> and C<\> the same in a path, you might as well use the
one that doesn't clash with Perl -- or the POSIX shell, ANSI C and C++,
awk, Tcl, Java, or Python, just to mention a few.

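A quick way to see the difference is to compare the three spellings
side by side. This is only an illustration; the variable names are
arbitrary and no file is actually opened.

```perl
use strict;
use warnings;

my $double  = "C:\temp\foo";    # \t and \f are interpreted as escapes
my $single  = 'C:\temp\foo';    # backslashes are kept literally
my $forward = "C:/temp/foo";    # nothing to escape

printf "double-quoted: %d characters, contains a tab: %s\n",
    length($double), ($double =~ /\t/ ? "yes" : "no");
printf "single-quoted: %d characters\n", length($single);
printf "forward slash: %d characters\n", length($forward);
```

The double-quoted string comes out two characters shorter than the
others, because each backslash-letter pair collapsed into a single
control character.
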
=head2 Why doesn't glob("*.*") get all the files?

Because even on non-Unix ports, Perl's glob function follows standard
Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
files. This makes glob() portable.

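The difference is easy to see in a scratch directory. The sketch
below assumes a Perl whose glob() is implemented by the bundled
File::Glob module; the file names are invented for the example.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir($dir) or die "Couldn't chdir to $dir: $!\n";

# Illustration only: one name without a dot, one with, one hidden.
for my $name (qw(README notes.txt .hidden)) {
    open(my $fh, ">", $name) or die "Couldn't create $name: $!\n";
    close($fh) or die "Couldn't close $name: $!\n";
}

my @with_dot = sort glob("*.*");    # only names that contain a dot
my @all      = sort glob("*");      # every non-hidden name

# Leave the scratch directory so it can be cleaned up at exit.
chdir($start) or die "Couldn't chdir back to $start: $!\n";

print "*.* matched: @with_dot\n";
print "*   matched: @all\n";
```
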
=head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl?

This is elaborately and painstakingly described in the "Far More Than
You Ever Wanted To Know" in
http://www.perl.com/CPAN/doc/FMTEYEWTK/file-dir-perms .

The executive summary: learn how your filesystem works. The
permissions on a file say what can happen to the data in that file.
The permissions on a directory say what can happen to the list of
files in that directory. If you delete a file, you're removing its
name from the directory (so the operation depends on the permissions
of the directory, not of the file). If you try to write to the file,
the permissions of the file govern whether you're allowed to.

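The summary can be tried out on a Unix-like filesystem with a sketch
like the one below. The file name is invented, and the write attempt
is only expected to fail when the script is not run by a superuser.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);    # a directory we may modify
my $file = "$dir/protected.txt";

open(my $fh, ">", $file) or die "Couldn't create $file: $!\n";
print {$fh} "precious data\n";
close($fh) or die "Couldn't close $file: $!\n";

chmod(0444, $file) or die "Couldn't chmod $file: $!\n";   # read-only

# Writing consults the permissions of the file itself
# (expected to be refused unless running as a superuser).
my $can_write = open(my $out, ">>", $file) ? 1 : 0;
close($out) if $can_write;

# Deleting consults the permissions of the directory instead.
my $removed = unlink($file);

print "could open read-only file for writing: $can_write\n";
print "files removed by unlink: $removed\n";
```
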
=head2 How do I select a random line from a file?

Here's an algorithm from the Camel Book:

    srand;
    rand($.) < 1 && ($line = $_) while <>;

This has a significant advantage in space over reading the whole
file in. A simple proof by induction is available upon
request if you doubt its correctness.

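The one-liner can be unpacked into a subroutine, sketched below with
an invented name, C<random_line>, and an in-memory filehandle as test
input. The reasoning behind it: line number n replaces the current
choice with probability 1/n, and a line chosen at step k survives each
later step j with probability (j-1)/j, so after N lines every line has
been kept with probability 1/N.

```perl
use strict;
use warnings;

# Hypothetical helper: return one line chosen uniformly at random
# from $fh, or undef if the handle yields no lines.
sub random_line {
    my ($fh) = @_;
    my ($chosen, $count);
    while (my $line = <$fh>) {
        $count++;
        # Replace the current choice with probability 1/$count.
        $chosen = $line if rand($count) < 1;
    }
    return $chosen;
}

# Illustration only: read from an in-memory filehandle.
my $text = "alpha\nbeta\ngamma\n";
open(my $fh, "<", \$text) or die "Couldn't open in-memory file: $!\n";
my $pick = random_line($fh);
close($fh);

print "picked: $pick";
```

Only one line is held in memory at a time, which is where the space
advantage over slurping the whole file comes from.
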
=head1 AUTHOR AND COPYRIGHT

Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington.
All rights reserved.

When included as part of the Standard Version of Perl, or as part of
its complete documentation whether printed or otherwise, this work
may be distributed only under the terms of Perl's Artistic License.
Any distribution of this file or derivatives thereof I<outside>
of that package require that special arrangements be made with
copyright holder.

Irrespective of its distribution, all code examples in this file
are hereby placed into the public domain. You are permitted and
encouraged to use this code in your own programs for fun
or for profit as you see fit. A simple comment in the code giving
credit would be courteous but is not required.