	1	=head1 NAME
	2
	3	perlopentut - tutorial on opening things in Perl
	4
	5	=head1 DESCRIPTION
	6
	7	Perl has two simple, built-in ways to open files: the shell way for
	8	convenience, and the C way for precision. The shell way also has 2- and
	9	3-argument forms, which have different semantics for handling the filename.
	10	The choice is yours.
	11
	12	=head1 Open E<agrave> la shell
	13
	14	Perl's C<open> function was designed to mimic the way command-line
	15	redirection in the shell works. Here are some basic examples
	16	from the shell:
	17
	18	$ myprogram file1 file2 file3
	19	$ myprogram < inputfile
	20	$ myprogram > outputfile
	21	$ myprogram >> outputfile
	22	$ myprogram | otherprogram
	23	$ otherprogram | myprogram
	24
	25	And here are some more advanced examples:
	26
	27	$ otherprogram | myprogram f1 - f2
	28	$ otherprogram 2>&1 | myprogram -
	29	$ myprogram <&3
	30	$ myprogram >&4
	31
	32	Programmers accustomed to constructs like those above can take comfort
	33	in learning that Perl directly supports these familiar constructs using
	34	virtually the same syntax as the shell.
	35
	36	=head2 Simple Opens
	37
	38	The C<open> function takes two arguments: the first is a filehandle,
	39	and the second is a single string comprising both what to open and how
	40	to open it. C<open> returns true when it works, and when it fails,
	41	returns a false value and sets the special variable C<$!> to reflect
	42	the system error. If the filehandle was previously opened, it will
	43	be implicitly closed first.
	44
	45	For example:
	46
	47	open(INFO, "datafile") || die("can't open datafile: $!");
	48	open(INFO, "< datafile") || die("can't open datafile: $!");
	49	open(RESULTS,"> runstats") || die("can't open runstats: $!");
	50	open(LOG, ">> logfile ") || die("can't open logfile: $!");
	51
	52	If you prefer the low-punctuation version, you could write that this way:
	53
	54	open INFO, "< datafile" or die "can't open datafile: $!";
	55	open RESULTS,"> runstats" or die "can't open runstats: $!";
	56	open LOG, ">> logfile " or die "can't open logfile: $!";
	57
	58	A few things to notice. First, the leading C<< < >> is optional.
	59	If omitted, Perl assumes that you want to open the file for reading.
	60
	61	Note also that the first example uses the C<||> logical operator, and the
	62	second uses C<or>, which has lower precedence. Using C<||> in the latter
	63	examples would effectively mean
	64
	65	open INFO, ( "< datafile" || die "can't open datafile: $!" );
	66
	67	which is definitely not what you want.
	68
	69	The other important thing to notice is that, just as in the shell,
	70	any whitespace before or after the filename is ignored. This is good,
	71	because you wouldn't want these to do different things:
	72
	73	open INFO, "<datafile"
	74	open INFO, "< datafile"
	75	open INFO, "<  datafile"
	76
	77	Ignoring surrounding whitespace also helps for when you read a filename
	78	in from a different file, and forget to trim it before opening:
	79
	80	$filename = <INFO>; # oops, \n still there
	81	open(EXTRA, "< $filename") || die "can't open $filename: $!";
	82
	83	This is not a bug, but a feature. Because C<open> mimics the shell in
	84	its style of using redirection arrows to specify how to open the file, it
	85	also does so with respect to extra whitespace around the filename
	86	itself. For accessing files with naughty names, see
	87	L<"Dispelling the Dweomer">.
	88
	89	There is also a 3-argument version of C<open>, which lets you put the
	90	special redirection characters into their own argument:
	91
	92	open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";
	93
	94	In this case, the filename to open is the actual string in C<$datafile>,
	95	so you don't have to worry about C<$datafile> containing characters
	96	that might influence the open mode, or whitespace at the beginning of
	97	the filename that would be absorbed in the 2-argument version. Also,
	98	any reduction of unnecessary string interpolation is a good thing.
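
To make the difference concrete, here is a minimal sketch of the 3-argument form coping with an awkward filename. The name itself (leading space, a greater-than sign, trailing space) and the scratch directory from File::Temp are assumptions made up for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical awkward name: leading space, a '>' and a trailing space.
my $dir  = tempdir(CLEANUP => 1);
my $name = "$dir/ > odd name ";

# The mode lives in its own argument, so $name is used byte for byte.
open(my $out, ">", $name) || die "can't create '$name': $!";
print {$out} "hello\n";
close($out) || die "can't close '$name': $!";

open(my $in, "<", $name) || die "can't read '$name': $!";
my $line = <$in>;
close($in);

print "read back: $line";
```

With the 2-argument form, the same string would have its surrounding whitespace trimmed and its leading punctuation read as a mode.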
	99
	100	=head2 Indirect Filehandles
	101
	102	C<open>'s first argument can be a reference to a filehandle. As of
	103	perl 5.6.0, if the argument is uninitialized, Perl will automatically
	104	create a filehandle and put a reference to it in the first argument,
	105	like so:
	106
	107	open( my $in, $infile ) or die "Couldn't read $infile: $!";
	108	while ( <$in> ) {
	109	# do something with $_
	110	}
	111	close $in;
	112
	113	Indirect filehandles make namespace management easier. Since filehandles
	114	are global to the current package, two subroutines trying to open
	115	C<INFILE> will clash. With two functions opening indirect filehandles
	116	like C<my $infile>, there's no clash and no need to worry about future
	117	conflicts.
	118
	119	Another convenient behavior is that an indirect filehandle automatically
	120	closes when there are no more references to it:
	121
	122	sub firstline {
	123	open( my $in, shift ) && return scalar <$in>;
	124	# no close() required
	125	}
	126
	127	Indirect filehandles also make it easy to pass filehandles to and return
	128	filehandles from subroutines:
	129
	130	for my $file ( qw(this.conf that.conf) ) {
	131	my $fin = open_or_throw('<', $file);
	132	process_conf( $fin );
	133	# no close() needed
	134	}
	135
	136	use Carp;
	137	sub open_or_throw {
	138	my ($mode, $filename) = @_;
	139	open my $h, $mode, $filename
	140	or croak "Could not open '$filename': $!";
	141	return $h;
	142	}
	143
	144	=head2 Pipe Opens
	145
	146	In C, when you want to open a file using the standard I/O library,
	147	you use the C<fopen> function, but when opening a pipe, you use the
	148	C<popen> function. But in the shell, you just use a different redirection
	149	character. That's also the case for Perl. The C<open> call
	150	remains the same--just its argument differs.
	151
	152	If the leading character is a pipe symbol, C<open> starts up a new
	153	command and opens a write-only filehandle leading into that command.
	154	This lets you write into that handle and have what you write show up on
	155	that command's standard input. For example:
	156
	157	open(PRINTER, "| lpr -Plp1") || die "can't run lpr: $!";
	158	print PRINTER "stuff\n";
	159	close(PRINTER) || die "can't close lpr: $!";
	160
	161	If the trailing character is a pipe, you start up a new command and open a
	162	read-only filehandle leading out of that command. This lets whatever that
	163	command writes to its standard output show up on your handle for reading.
	164	For example:
	165
	166	open(NET, "netstat -i -n |") || die "can't fork netstat: $!";
	167	while (<NET>) { } # do something with input
	168	close(NET) || die "can't close netstat: $!";
	169
	170	What happens if you try to open a pipe to or from a non-existent
	171	command? If possible, Perl will detect the failure and set C<$!> as
	172	usual. But if the command contains special shell characters, such as
	173	C<E<gt>> or C<*>, called 'metacharacters', Perl does not execute the
	174	command directly. Instead, Perl runs the shell, which then tries to
	175	run the command. This means that it's the shell that gets the error
	176	indication. In such a case, the C<open> call will only indicate
	177	failure if Perl can't even run the shell. See L<perlfaq8/"How can I
	178	capture STDERR from an external command?"> to see how to cope with
	179	this. There's also an explanation in L<perlipc>.
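
The following sketch shows one way to notice such a failure, assuming a Unix-like system with a POSIX shell. The command name is a hypothetical one chosen so that it should not exist. Because the string contains a redirection, Perl hands it to the shell, so C<open> itself succeeds and the failure only shows up when the handle is closed.

```perl
use strict;
use warnings;

# Hypothetical command name, chosen so that it should not exist.
# The "2>/dev/null" redirection is a shell metacharacter, so Perl
# runs the shell and lets the shell look for the command.
my $cmd = "no_such_command_xyzzy 2>/dev/null";

my $pid = open(my $ph, "$cmd |");
print "open ", ($pid ? "succeeded" : "failed: $!"), "\n";

my @output = <$ph>;    # nothing arrives: the command never ran

# close() waits for the child and leaves its wait status in $?.
my $closed_ok = close($ph);
my $exit      = $? >> 8;
print "close reported failure, exit status $exit\n" unless $closed_ok;
```

Most shells report a missing command with exit status 127, but the only thing this sketch relies on is that the status is non-zero.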
	180
	181	If you would like to open a bidirectional pipe, the IPC::Open2
	182	library will handle this for you. Check out
	183	L<perlipc/"Bidirectional Communication with Another Process">.
	184
	185	perl-5.6.x introduced a version of piped open that executes a process
	186	based on its command line arguments without relying on the shell. (Similar
	187	to the C<system(@LIST)> notation.) This is safer and faster than executing
	188	a single argument pipe-command, but does not allow special shell
	189	constructs. (It is also not supported on Microsoft Windows, Mac OS Classic
	190	or RISC OS.)
	191
	192	Here's an example of C<open '-|'>, which prints a random Unix
	193	fortune cookie as uppercase:
	194
	195	my $collection = shift(@ARGV);
	196	open my $fortune, '-|', 'fortune', $collection
	197	or die "Could not find fortune - $!";
	198	while (<$fortune>)
	199	{
	200	print uc($_);
	201	}
	202	close($fortune);
	203
	204	And this C<open '|-'> pipes into lpr:
	205
	206	open my $printer, '|-', 'lpr', '-Plp1'
	207	or die "can't run lpr: $!";
	208	print {$printer} "stuff\n";
	209	close($printer)
	210	or die "can't close lpr: $!";
	211
	212	=head2 The Minus File
	213
	214	Again following the lead of the standard shell utilities, Perl's
	215	C<open> function treats a file whose name is a single minus, "-", in a
	216	special way. If you open minus for reading, it really means to access
	217	the standard input. If you open minus for writing, it really means to
	218	access the standard output.
	219
	220	If minus can be used as the default input or default output, what happens
	221	if you open a pipe into or out of minus? What's the default command it
	222	would run? The same script as you're currently running! This is actually
	223	a stealth C<fork> hidden inside an C<open> call. See
	224	L<perlipc/"Safe Pipe Opens"> for details.
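
As a rough sketch of that stealth fork, assuming a system where C<fork> works: the C<open> returns the child's process ID in the parent, zero in the child, and undef on failure. The message text is made up for the example.

```perl
use strict;
use warnings;

# Opening "-|" forks: the parent gets a handle that reads whatever
# the child prints on its STDOUT.
my $pid = open(my $from_kid, "-|");
die "cannot fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: the same script, now running as a second process.
    print "hello from the child\n";
    exit 0;
}

# Parent: read the child's output through the handle.
my $line = <$from_kid>;
close($from_kid) || die "child exited with status $?";
print "parent read: $line";
```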
	225
	226	=head2 Mixing Reads and Writes
	227
	228	It is possible to specify both read and write access. All you do is
	229	add a "+" symbol in front of the redirection. But as in the shell,
	230	using a less-than on a file never creates a new file; it only opens an
	231	existing one. On the other hand, using a greater-than always clobbers
	232	(truncates to zero length) an existing file, or creates a brand-new one
	233	if there isn't an old one. Adding a "+" for read-write doesn't affect
	234	whether it only works on existing files or always clobbers existing ones.
	235
	236	open(WTMP, "+< /usr/adm/wtmp")
	237	|| die "can't open /usr/adm/wtmp: $!";
	238
	239	open(SCREEN, "+> lkscreen")
	240	|| die "can't open lkscreen: $!";
	241
	242	open(LOGFILE, "+>> /var/log/applog")
	243	|| die "can't open /var/log/applog: $!";
	244
	245	The first one won't create a new file, and the second one will always
	246	clobber an old one. The third one will create a new file if necessary
	247	and not clobber an old one, and it will allow you to read at any point
	248	in the file, but all writes will always go to the end. In short,
	249	the first case is substantially more common than the second and third
	250	cases, which are almost always wrong. (If you know C, the plus in
	251	Perl's C<open> is historically derived from the one in C's fopen(3S),
	252	which it ultimately calls.)
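
Here is a small sketch of the first, common case: updating part of an existing file in place. The four-byte fixed-width records and the scratch file are assumptions invented for the illustration, and the 3-argument form of C<open> is used so that the path is taken literally.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file with three fixed-width records of 4 bytes each
# (the record layout is an assumption made for this example).
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "AAAABBBBCCCC";
close($tmp) || die "can't close $path: $!";

# "+<" opens for update: the file must exist and is not truncated.
open(my $fh, "+<", $path) || die "can't update $path: $!";
seek($fh, 4, 0) || die "can't seek: $!";      # start of the second record
print {$fh} "bbbb";                            # overwrite it in place
seek($fh, 0, 0) || die "can't rewind: $!";    # seek between writing and reading
my $contents = <$fh>;
close($fh) || die "can't close $path: $!";

print "$contents\n";
```

Only the second record changes; the rest of the file is left alone, which is exactly what C<< +> >> would not do.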
	253
	254	In fact, when it comes to updating a file, unless you're working on
	255	a binary file as in the WTMP case above, you probably don't want to
	256	use this approach for updating. Instead, Perl's B<-i> flag comes to
	257	the rescue. The following command takes all the C, C++, or yacc source
	258	or header files and changes all their foo's to bar's, leaving
	259	the old version in the original filename with a ".orig" tacked
	260	on the end:
	261
	262	$ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy]
	263
	264	This is a short cut for some renaming games that are really
	265	the best way to update textfiles. See the second question in
	266	L<perlfaq5> for more details.
	267
	268	=head2 Filters
	269
	270	One of the most common uses for C<open> is one you never
	271	even notice. When you process the ARGV filehandle using
	272	C<< <ARGV> >>, Perl actually does an implicit open
	273	on each file in @ARGV. Thus a program called like this:
	274
	275	$ myprogram file1 file2 file3
	276
	277	can have all its files opened and processed one at a time
	278	using a construct no more complex than:
	279
	280	while (<>) {
	281	# do something with $_
	282	}
	283
	284	If @ARGV is empty when the loop first begins, Perl pretends you've opened
	285	up minus, that is, the standard input. In fact, $ARGV, the currently
	286	open file during C<< <ARGV> >> processing, is even set to "-"
	287	in these circumstances.
	288
	289	You are welcome to pre-process your @ARGV before starting the loop to
	290	make sure it's to your liking. One reason to do this might be to remove
	291	command options beginning with a minus. While you can always roll the
	292	simple ones by hand, the Getopt modules are good for this:
	293
	294	use Getopt::Std;
	295
	296	# -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o
	297	getopts("vDo:");
	298
	299	# -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o}
	300	getopts("vDo:", \%args);
	301
	302	Or the standard Getopt::Long module to permit named arguments:
	303
	304	use Getopt::Long;
	305	GetOptions( "verbose" => \$verbose, # --verbose
	306	"Debug" => \$debug, # --Debug
	307	"output=s" => \$output );
	308	# --output=somestring or --output somestring
	309
	310	Another reason for preprocessing arguments is to make an empty
	311	argument list default to all files:
	312
	313	@ARGV = glob("*") unless @ARGV;
	314
	315	You could even filter out all but plain, text files. This is a bit
	316	silent, of course, and you might prefer to mention them on the way.
	317
	318	@ARGV = grep { -f && -T } @ARGV;
	319
	320	If you're using the B<-n> or B<-p> command-line options, you
	321	should put changes to @ARGV in a C<BEGIN{}> block.
	322
	323	Remember that a normal C<open> has special properties, in that it might
	324	call fopen(3S) or it might call popen(3S), depending on what its
	325	argument looks like; that's why it's sometimes called "magic open".
	326	Here's an example:
	327
	328	$pwdinfo = `domainname` =~ /^(\(none\))?$/
	329	? '< /etc/passwd'
	330	: 'ypcat passwd |';
	331
	332	open(PWD, $pwdinfo)
	333	or die "can't open $pwdinfo: $!";
	334
	335	This sort of thing also comes into play in filter processing. Because
	336	C<< <ARGV> >> processing employs the normal, shell-style Perl C<open>,
	337	it respects all the special things we've already seen:
	338
	339	$ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
	340
	341	That program will read from the file F<f1>, the process F<cmd1>, standard
	342	input (F<tmpfile> in this case), the F<f2> file, the F<cmd2> command,
	343	and finally the F<f3> file.
	344
	345	Yes, this also means that if you have files named "-" (and so on) in
	346	your directory, they won't be processed as literal files by C<open>.
	347	You'll need to pass them as "./-", much as you would for the I<rm> program,
	348	or you could use C<sysopen> as described below.
	349
	350	One of the more interesting applications is to change files of a certain
	351	name into pipes. For example, to autoprocess gzipped or compressed
	352	files by decompressing them with I<gzip>:
	353
	354	@ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_ } @ARGV;
	355
	356	Or, if you have the I<GET> program installed from LWP,
	357	you can fetch URLs before processing them:
	358
	359	@ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV;
	360
	361	It's not for nothing that this is called magic C<< <ARGV> >>.
	362	Pretty nifty, eh?
	363
	364	=head1 Open E<agrave> la C
	365
	366	If you want the convenience of the shell, then Perl's C<open> is
	367	definitely the way to go. On the other hand, if you want finer precision
	368	than C's simplistic fopen(3S) provides you should look to Perl's
	369	C<sysopen>, which is a direct hook into the open(2) system call.
	370	That does mean it's a bit more involved, but that's the price of
	371	precision.
	372
	373	C<sysopen> takes 3 (or 4) arguments.
	374
	375	sysopen HANDLE, PATH, FLAGS, [MASK]
	376
	377	The HANDLE argument is a filehandle just as with C<open>. The PATH is
	378	a literal path, one that doesn't pay attention to any greater-thans or
	379	less-thans or pipes or minuses, nor ignore whitespace. If it's there,
	380	it's part of the path. The FLAGS argument contains one or more values
	381	derived from the Fcntl module that have been or'd together using the
	382	bitwise "|" operator. The final argument, the MASK, is optional; if
	383	present, it is combined with the user's current umask for the creation
	384	mode of the file. You should usually omit this.
	385
	386	Although the traditional values of read-only, write-only, and read-write
	387	are 0, 1, and 2 respectively, this is known not to hold true on some
	388	systems. Instead, it's best to load in the appropriate constants first
	389	from the Fcntl module, which supplies the following standard flags:
	390
	391	O_RDONLY Read only
	392	O_WRONLY Write only
	393	O_RDWR Read and write
	394	O_CREAT Create the file if it doesn't exist
	395	O_EXCL Fail if the file already exists
	396	O_APPEND Append to the file
	397	O_TRUNC Truncate the file
	398	O_NONBLOCK Non-blocking access
	399
	400	Less common flags that are sometimes available on some operating
	401	systems include C<O_BINARY>, C<O_TEXT>, C<O_SHLOCK>, C<O_EXLOCK>,
	402	C<O_DEFER>, C<O_SYNC>, C<O_ASYNC>, C<O_DSYNC>, C<O_RSYNC>,
	403	C<O_NOCTTY>, C<O_NDELAY> and C<O_LARGEFILE>. Consult your open(2)
	404	manpage or its local equivalent for details. (Note: starting from
	405	Perl release 5.6 the C<O_LARGEFILE> flag, if available, is automatically
	406	added to the sysopen() flags because large files are the default.)
	407
	408	Here's how to use C<sysopen> to emulate the simple C<open> calls we had
	409	before. We'll omit the C<|| die $!> checks for clarity, but make sure
	410	you always check the return values in real code. These aren't quite
	411	the same, since C<open> will trim leading and trailing whitespace,
	412	but you'll get the idea.
	413
	414	To open a file for reading:
	415
	416	open(FH, "< $path");
	417	sysopen(FH, $path, O_RDONLY);
	418
	419	To open a file for writing, creating a new file if needed or else truncating
	420	an old file:
	421
	422	open(FH, "> $path");
	423	sysopen(FH, $path, O_WRONLY | O_TRUNC | O_CREAT);
	424
	425	To open a file for appending, creating one if necessary:
	426
	427	open(FH, ">> $path");
	428	sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT);
	429
	430	To open a file for update, where the file must already exist:
	431
	432	open(FH, "+< $path");
	433	sysopen(FH, $path, O_RDWR);
	434
	435	And here are things you can do with C<sysopen> that you cannot do with
	436	a regular C<open>. As you'll see, it's just a matter of controlling the
	437	flags in the third argument.
	438
	439	To open a file for writing, creating a new file which must not previously
	440	exist:
	441
	442	sysopen(FH, $path, O_WRONLY | O_EXCL | O_CREAT);
	443
	444	To open a file for appending, where that file must already exist:
	445
	446	sysopen(FH, $path, O_WRONLY | O_APPEND);
	447
	448	To open a file for update, creating a new file if necessary:
	449
	450	sysopen(FH, $path, O_RDWR | O_CREAT);
	451
	452	To open a file for update, where that file must not already exist:
	453
	454	sysopen(FH, $path, O_RDWR | O_EXCL | O_CREAT);
	455
	456	To open a file without blocking, creating one if necessary:
	457
	458	sysopen(FH, $path, O_WRONLY | O_NONBLOCK | O_CREAT);
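
The snippets above leave out the error checks. A fuller sketch of the "must not previously exist" case, with the checks put back, might look like this. The lock-file name and the scratch directory are hypothetical; the constants come from the Fcntl and Errno modules.

```perl
use strict;
use warnings;
use Fcntl;                       # exports O_WRONLY, O_CREAT, O_EXCL, ...
use Errno qw(EEXIST);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/app.lock";      # hypothetical lock-file name

# First attempt: the file does not exist yet, so it is created.
sysopen(my $first, $path, O_WRONLY | O_EXCL | O_CREAT)
    || die "can't create $path: $!";
print "created $path\n";
close($first);

# Second attempt: O_EXCL makes the call fail because the file exists.
my $second_ok = sysopen(my $second, $path, O_WRONLY | O_EXCL | O_CREAT);
my $already   = !$second_ok && $! == EEXIST;
print "second sysopen refused: $!\n" if $already;
```

Because the existence test and the creation happen in a single system call, there is no window in which another process can slip in between them.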
	459
	460	=head2 Permissions E<agrave> la mode
	461
	462	If you omit the MASK argument to C<sysopen>, Perl uses the octal value
	463	0666. The normal MASK to use for executables and directories should
	464	be 0777, and for anything else, 0666.
	465
	466	Why so permissive? Well, it isn't really. The MASK will be modified
	467	by your process's current C<umask>. A umask is a number representing
	468	I<disabled> permissions bits; that is, bits that will not be turned on
	469	in the created file's permissions field.
	470
	471	For example, if your C<umask> were 027, then the 020 part would
	472	disable the group from writing, and the 007 part would disable others
	473	from reading, writing, or executing. Under these conditions, passing
	474	C<sysopen> 0666 would create a file with mode 0640, since C<0666 & ~027>
	475	is 0640.
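
That arithmetic can be checked with a short sketch, assuming a Unix-like filesystem with ordinary permission bits. The file name is hypothetical, and the umask is set explicitly only so that the result is predictable.

```perl
use strict;
use warnings;
use Fcntl;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/report.txt";    # hypothetical file name

my $old_umask = umask(027);      # disable group write and all "other" bits

sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0666)
    || die "can't create $path: $!";
close($fh);

umask($old_umask);               # restore the caller's umask

# The low twelve bits of the mode hold the permission bits.
my $mode = (stat($path))[2] & 07777;
printf "%s has mode %04o\n", $path, $mode;
```

Under those assumptions the reported mode should be 0640.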
	476
	477	You should seldom use the MASK argument to C<sysopen()>. That takes
	478	away the user's freedom to choose what permission new files will have.
	479	Denying choice is almost always a bad thing. One exception would be for
	480	cases where sensitive or private data is being stored, such as with mail
	481	folders, cookie files, and internal temporary files.
	482
	483	=head1 Obscure Open Tricks
	484
	485	=head2 Re-Opening Files (dups)
	486
	487	Sometimes you already have a filehandle open, and want to make another
	488	handle that's a duplicate of the first one. In the shell, we place an
	489	ampersand in front of a file descriptor number when doing redirections.
	490	For example, C<< 2>&1 >> makes descriptor 2 (that's STDERR in Perl)
	491	be redirected into descriptor 1 (which is usually Perl's STDOUT).
	492	The same is essentially true in Perl: a filename that begins with an
	493	ampersand is treated instead as a file descriptor if a number, or as a
	494	filehandle if a string.
	495
	496	open(SAVEOUT, ">&SAVEERR") || die "couldn't dup SAVEERR: $!";
	497	open(MHCONTEXT, "<&4") || die "couldn't dup fd4: $!";
	498
	499	That means that if a function is expecting a filename, but you don't
	500	want to give it a filename because you already have the file open, you
	501	can just pass the filehandle with a leading ampersand. It's best to
	502	use a fully qualified handle though, just in case the function happens
	503	to be in a different package:
	504
	505	somefunction("&main::LOGFILE");
	506
	507	This way if somefunction() is planning on opening its argument, it can
	508	just use the already opened handle. This differs from passing a handle,
	509	because with a handle, you don't open the file. Here you have something
	510	you can pass to open.
	511
	512	If you have one of those tricky, newfangled I/O objects that the C++
	513	folks are raving about, then this doesn't work because those aren't
	514	proper filehandles in the native Perl sense. You'll have to use fileno()
	515	to pull out the proper descriptor number, assuming you can:
	516
	517	use IO::Socket;
	518	$handle = IO::Socket::INET->new("www.perl.com:80");
	519	$fd = $handle->fileno;
	520	somefunction("&$fd"); # not an indirect function call
	521
	522	It can be easier (and certainly will be faster) just to use real
	523	filehandles though:
	524
	525	use IO::Socket;
	526	local *REMOTE = IO::Socket::INET->new("www.perl.com:80");
	527	die "can't connect" unless defined(fileno(REMOTE));
	528	somefunction("&main::REMOTE");
	529
	530	If the filehandle or descriptor number is preceded not just with a simple
	531	"&" but rather with a "&=" combination, then Perl will not create a
	532	completely new descriptor opened to the same place using the dup(2)
	533	system call. Instead, it will just make something of an alias to the
	534	existing one using the fdopen(3S) library call. This is slightly more
	535	parsimonious of systems resources, although this is less a concern
	536	these days. Here's an example of that:
	537
	538	$fd = $ENV{"MHCONTEXTFD"};
	539	open(MHCONTEXT, "<&=$fd") or die "couldn't fdopen $fd: $!";
	540
	541	If you're using magic C<< <ARGV> >>, you could even pass in as a
	542	command line argument in @ARGV something like C<"<&=$MHCONTEXTFD">,
	543	but we've never seen anyone actually do this.
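
The difference between the two forms is easiest to see by comparing descriptor numbers. This sketch uses the 3-argument spelling of both and assumes STDOUT is open when it runs.

```perl
use strict;
use warnings;

# ">&" asks for dup(2): a new descriptor that refers to the same open file.
open(my $dup, ">&", \*STDOUT) || die "can't dup STDOUT: $!";

# ">&=" asks for fdopen(3S): a new Perl handle on the same descriptor.
open(my $alias, ">&=", fileno(STDOUT)) || die "can't fdopen STDOUT: $!";

printf "STDOUT is descriptor %d\n", fileno(STDOUT);
printf "dup    is descriptor %d\n", fileno($dup);
printf "alias  is descriptor %d\n", fileno($alias);
```

Keep in mind that closing the alias closes the shared descriptor as well, which is one reason to prefer a real dup when the two handles have separate lifetimes.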
	544
	545	=head2 Dispelling the Dweomer
	546
	547	Perl is more of a DWIMmer language than something like Java--where DWIM
	548	is an acronym for "do what I mean". But this principle sometimes leads
	549	to more hidden magic than one knows what to do with. In this way, Perl
	550	is also filled with I<dweomer>, an obscure word meaning an enchantment.
	551	Sometimes, Perl's DWIMmer is just too much like dweomer for comfort.
	552
	553	If magic C<open> is a bit too magical for you, you don't have to turn
	554	to C<sysopen>. To open a file with arbitrary weird characters in
	555	it, it's necessary to protect any leading and trailing whitespace.
	556	Leading whitespace is protected by inserting a C<"./"> in front of a
	557	filename that starts with whitespace. Trailing whitespace is protected
	558	by appending an ASCII NUL byte (C<"\0">) at the end of the string.
	559
	560	$file =~ s#^(\s)#./$1#;
	561	open(FH, "< $file\0") || die "can't open $file: $!";
	562
	563	This assumes, of course, that your system considers dot the current
	564	working directory, slash the directory separator, and disallows ASCII
	565	NULs within a valid filename. Most systems follow these conventions,
	566	including all POSIX systems as well as proprietary Microsoft systems.
	567	The only vaguely popular system that doesn't work this way is the
	568	"Classic" Macintosh system, which uses a colon where the rest of us
	569	use a slash. Maybe C<sysopen> isn't such a bad idea after all.
	570
	571	If you want to use C<< <ARGV> >> processing in a totally boring
	572	and non-magical way, you could do this first:
	573
	574	# "Sam sat on the ground and put his head in his hands.
	575	# 'I wish I had never come here, and I don't want to see
	576	# no more magic,' he said, and fell silent."
	577	for (@ARGV) {
	578	s#^([^./])#./$1#;
	579	$_ .= "\0";
	580	}
	581	while (<>) {
	582	# now process $_
	583	}
	584
	585	But be warned that users will not appreciate being unable to use "-"
	586	to mean standard input, per the standard convention.
	587
	588	=head2 Paths as Opens
	589
	590	You've probably noticed how Perl's C<warn> and C<die> functions can
	591	produce messages like:
	592
	593	Some warning at scriptname line 29, <FH> line 7.
	594
	595	That's because you opened a filehandle FH, and had read in seven records
	596	from it. But what was the name of the file, rather than the handle?
	597
	598	If you aren't running with C<strict refs>, or if you've turned them off
	599	temporarily, then all you have to do is this:
	600
	601	open($path, "< $path") || die "can't open $path: $!";
	602	while (<$path>) {
	603	# whatever
	604	}
	605
	606	Since you're using the pathname of the file as its handle,
	607	you'll get warnings more like
	608
	609	Some warning at scriptname line 29, </etc/motd> line 7.
	610
	611	=head2 Single Argument Open
	612
	613	Remember how we said that Perl's open took two arguments? That was a
	614	passive prevarication. You see, it can also take just one argument.
	615	If and only if the variable is a global variable, not a lexical, you
	616	can pass C<open> just one argument, the filehandle, and it will
	617	get the path from the global scalar variable of the same name.
	618
	619	$FILE = "/etc/motd";
	620	open FILE or die "can't open $FILE: $!";
	621	while (<FILE>) {
	622	# whatever
	623	}
	624
	625	Why is this here? Someone has to cater to the hysterical porpoises.
	626	It's something that's been in Perl since the very beginning, if not
	627	before.
	628
	629	=head2 Playing with STDIN and STDOUT
	630
	631	One clever move with STDOUT is to explicitly close it when you're done
	632	with the program.
	633
	634	END { close(STDOUT) || die "can't close stdout: $!" }
	635
	636	If you don't do this, and your program fills up the disk partition due
	637	to a command line redirection, it won't report the error or exit with a
	638	failure status.
	639
	640	You don't have to accept the STDIN and STDOUT you were given. You are
	641	welcome to reopen them if you'd like.
	642
	643	open(STDIN, "< datafile")
	644	|| die "can't open datafile: $!";
	645
	646	open(STDOUT, "> output")
	647	|| die "can't open output: $!";
	648
	649	And then these can be accessed directly or passed on to subprocesses.
	650	This makes it look as though the program were initially invoked
	651	with those redirections from the command line.
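
If the redirection should only be temporary, one approach is to keep a duplicate of the original handle and restore it afterwards. This is a sketch under a few assumptions: the scratch file comes from File::Temp, and the subprocess is simply another copy of the running perl (C<$^X>).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp);

# Keep a duplicate of the original STDOUT so it can be restored later.
open(my $saved, ">&", \*STDOUT) || die "can't dup STDOUT: $!";

open(STDOUT, ">", $path) || die "can't redirect STDOUT to $path: $!";
print "captured line\n";                           # lands in the scratch file
system($^X, "-e", 'print "from a subprocess\n"');  # so does the child's output

# Put the original STDOUT back; reopening closes the scratch file first.
open(STDOUT, ">&", $saved) || die "can't restore STDOUT: $!";

open(my $in, "<", $path) || die "can't read $path: $!";
my @captured = <$in>;
close($in);
print "captured: $_" for @captured;
```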
	652
	653	It's probably more interesting to connect these to pipes. For example:
	654
	655	$pager = $ENV{PAGER} || "(less || more)";
	656	open(STDOUT, "| $pager")
	657	|| die "can't fork a pager: $!";
	658
	659	This makes it appear as though your program were called with its stdout
	660	already piped into your pager. You can also use this kind of thing
	661	in conjunction with an implicit fork to yourself. You might do this
	662	if you would rather handle the post processing in your own program,
	663	just in a different process:
	664
	665	head(100);
	666	while (<>) {
	667	print;
	668	}
	669
	670	sub head {
	671	my $lines = shift || 20;
	672	return if $pid = open(STDOUT, "|-"); # return if parent
	673	die "cannot fork: $!" unless defined $pid;
	674	while (<STDIN>) {
	675	last if --$lines < 0;
	676	print;
	677	}
	678	exit;
	679	}
	680
	681	This technique can be applied to repeatedly push as many filters on your
	682	output stream as you wish.
	683
	684	=head1 Other I/O Issues
	685
	686	These topics aren't really arguments related to C<open> or C<sysopen>,
	687	but they do affect what you do with your open files.
	688
	689	=head2 Opening Non-File Files
	690
	691	When is a file not a file? Well, you could say when it exists but
	692	isn't a plain file. We'll check whether it's a symbolic link first,
	693	just in case.
	694
	695	if (-l $file || ! -f _) {
	696	print "$file is not a plain file\n";
	697	}
	698
	699	What other kinds of files are there than, well, files? Directories,
	700	symbolic links, named pipes, Unix-domain sockets, and block and character
	701	devices. Those are all files, too--just not I<plain> files. This isn't
	702	the same issue as being a text file. Not all text files are plain files.
	703	Not all plain files are text files. That's why there are separate C<-f>
	704	and C<-T> file tests.
	705
	706	To open a directory, you should use the C<opendir> function, then
	707	process it with C<readdir>, carefully restoring the directory
	708	name if necessary:
	709
	710	opendir(DIR, $dirname) or die "can't opendir $dirname: $!";
	711	while (defined($file = readdir(DIR))) {
	712	# do something with "$dirname/$file"
	713	}
	714	closedir(DIR);
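
A slightly fuller sketch of the same loop, using a lexical directory handle. The scratch directory and the two file names are assumptions made up so that the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory holding two hypothetical files.
my $dirname = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt)) {
    open(my $fh, ">", "$dirname/$name") || die "can't create $name: $!";
    close($fh);
}

opendir(my $dh, $dirname) || die "can't opendir $dirname: $!";
my @plain;
while (defined(my $entry = readdir($dh))) {
    next if $entry eq "." || $entry eq "..";   # every directory lists these
    my $full = "$dirname/$entry";              # readdir returns bare names
    push @plain, $entry if -f $full;
}
closedir($dh);

print "$_\n" for sort @plain;
```

The file test is applied to the full path because C<readdir> hands back names relative to the directory, not to the current working directory.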
	715
	716	If you want to process directories recursively, it's better to use the
	717	File::Find module. For example, this prints out all files recursively
	718	and adds a slash to their names if the file is a directory.
	719
	720	@ARGV = qw(.) unless @ARGV;
	721	use File::Find;
	722	find sub { print $File::Find::name, -d && '/', "\n" }, @ARGV;
	723
	724	This finds all bogus symbolic links beneath a particular directory:
	725
	726	find sub { print "$File::Find::name\n" if -l && !-e }, $dir;
	727
	728	As you see, with symbolic links, you can just pretend that it is
	729	what it points to. Or, if you want to know I<what> it points to, then
	730	C<readlink> is called for:
	731
	732	if (-l $file) {
	733	if (defined($whither = readlink($file))) {
	734	print "$file points to $whither\n";
	735	} else {
	736	print "$file points nowhere: $!\n";
	737	}
	738	}
	739
	740	=head2 Opening Named Pipes
	741
	742	Named pipes are a different matter. You pretend they're regular files,
	743	but their opens will normally block until there is both a reader and
	744	a writer. You can read more about them in L<perlipc/"Named Pipes">.
	745	Unix-domain sockets are rather different beasts as well; they're
	746	described in L<perlipc/"Unix-Domain TCP Clients and Servers">.
	747
	748	When it comes to opening devices, it can be easy and it can be tricky.
	749	We'll assume that if you're opening up a block device, you know what
	750	you're doing. The character devices are more interesting. These are
	751	typically used for modems, mice, and some kinds of printers. This is
	752	described in L<perlfaq8/"How do I read and write the serial port?">.
	753	It's often enough to open them carefully:
	754
	755	sysopen(TTYIN, "/dev/ttyS1", O_RDWR | O_NDELAY | O_NOCTTY)
	756	# (O_NOCTTY no longer needed on POSIX systems)
	757	or die "can't open /dev/ttyS1: $!";
	758	open(TTYOUT, "+>&TTYIN")
	759	or die "can't dup TTYIN: $!";
	760
	761	$ofh = select(TTYOUT); $| = 1; select($ofh);
	762
	763	print TTYOUT "+++at\015";
	764	$answer = <TTYIN>;
	765
	766	With descriptors that you haven't opened using C<sysopen>, such as
	767	sockets, you can set them to be non-blocking using C<fcntl>:
	768
	769	use Fcntl;
	770	my $old_flags = fcntl($handle, F_GETFL, 0)
	771	or die "can't get flags: $!";
	772	fcntl($handle, F_SETFL, $old_flags | O_NONBLOCK)
	773	or die "can't set non blocking: $!";
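
To see what the non-blocking flag changes, here is a small sketch that uses a pipe instead of a socket, since a pipe needs no network. It assumes a Unix-like system; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

pipe(my $reader, my $writer) or die "can't make pipe: $!";

# Same two fcntl calls as above, applied to the read end of the pipe.
my $flags = fcntl($reader, F_GETFL, 0)
    or die "can't get flags: $!";
fcntl($reader, F_SETFL, $flags | O_NONBLOCK)
    or die "can't set non blocking: $!";

# Nothing has been written yet. A blocking read would hang here;
# a non-blocking one returns undef and sets $! instead.
my $got = sysread($reader, my $buf, 1024);
my $would_block = !defined($got) && ($!{EAGAIN} || $!{EWOULDBLOCK});
print "empty pipe: ",
      ($would_block ? "read would block" : "unexpected result"), "\n";

# Once data is available the same call succeeds.
syswrite($writer, "ready\n") or die "can't write to pipe: $!";
$got = sysread($reader, $buf, 1024);
print "after write: read $got bytes\n";
```

The C<%!> hash is used to test the error by name, because some systems report C<EAGAIN> and others C<EWOULDBLOCK>.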
	774
	775	Rather than losing yourself in a morass of twisting, turning C<ioctl>s,
	776	all dissimilar, if you're going to manipulate ttys, it's best to
	777	make calls out to the stty(1) program if you have it, or else use the
	778	portable POSIX interface. To figure this all out, you'll need to read the
	779	termios(3) manpage, which describes the POSIX interface to tty devices,
	780	and then L<POSIX>, which describes Perl's interface to POSIX. There are
	781	also some high-level modules on CPAN that can help you with these games.
	782	Check out Term::ReadKey and Term::ReadLine.
	783
	784	=head2 Opening Sockets
	785
	786	What else can you open? To open a connection using sockets, you won't use
	787	one of Perl's two open functions. See
	788	L<perlipc/"Sockets: Client/Server Communication"> for that. Here's an
	789	example. Once you have it, you can use FH as a bidirectional filehandle.
	790
	791	use IO::Socket;
	792	local *FH = IO::Socket::INET->new("www.perl.com:80");
	793
	794	For opening up a URL, the LWP modules from CPAN are just what
	795	the doctor ordered. There's no filehandle interface, but
	796	it's still easy to get the contents of a document:
	797
	798	use LWP::Simple;
	799	$doc = get('http://www.cpan.org/');
	800
	801	=head2 Binary Files
	802
	803	On certain legacy systems with what could charitably be called terminally
	804	convoluted (some would say broken) I/O models, a file isn't a file--at
	805	least, not with respect to the C standard I/O library. On these old
	806	systems whose libraries (but not kernels) distinguish between text and
	807	binary streams, to get files to behave properly you'll have to bend over
	808	backwards to avoid nasty problems. On such infelicitous systems, sockets
	809	and pipes are already opened in binary mode, and there is currently no
	810	way to turn that off. With files, you have more options.
	811
	812	One option is to use the C<binmode> function on the appropriate
	813	handles before doing regular I/O on them:
	814
	815	binmode(STDIN);
	816	binmode(STDOUT);
	817	while (<STDIN>) { print }
	818
	819	Passing C<sysopen> a non-standard flag option will also open the file in
	820	binary mode on those systems that support it. This is the equivalent of
	821	opening the file normally, then calling C<binmode> on the handle.
	822
	823	sysopen(BINDAT, "records.data", O_RDWR | O_BINARY)
	824	|| die "can't open records.data: $!";
	825
	826	Now you can use C<read> and C<print> on that handle without worrying
	827	about the non-standard system I/O library breaking your data. It's not
	828	a pretty picture, but then, legacy systems seldom are. CP/M will be
	829	with us until the end of days, and after.
	830
	831	On systems with exotic I/O systems, it turns out that, astonishingly
	832	enough, even unbuffered I/O using C<sysread> and C<syswrite> might do
	833	sneaky data mutilation behind your back.
	834
	835	while (sysread(WHENCE, $buf, 1024)) {
	836	    syswrite(WHITHER, $buf, length($buf));
	837	}
	838
	839	Depending on the vicissitudes of your runtime system, even these calls
	840	may need C<binmode> or C<O_BINARY> first. Systems known to be free of
	841	such difficulties include Unix, the Mac OS, Plan 9, and Inferno.
	842
	843	=head2 File Locking
	844
	845	In a multitasking environment, you may need to be careful not to collide
	846	with other processes that want to do I/O on the same files as you
	847	are working on. You'll often need shared or exclusive locks
	848	on files for reading and writing respectively. You might just
	849	pretend that only exclusive locks exist.
	850
	851	Never use the existence of a file C<-e $file> as a locking indication,
	852	because there is a race condition between the test for the existence of
	853	the file and its creation. It's possible for another process to create
	854	a file in the slice of time between your existence check and your attempt
	855	to create the file. Atomicity is critical.
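
As a sketch of what "atomic" means here, the following uses C<sysopen> with C<O_CREAT | O_EXCL>, which asks the system to test for the file and create it in one step. The helper name C<create_exclusively> and the file name are made up for illustration, and this shows only atomic creation; it is not a substitute for C<flock>.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

# Hypothetical helper: returns a handle if we created the file,
# or nothing if it already existed (or could not be created).
sub create_exclusively {
    my ($path) = @_;
    # With O_EXCL the call fails if the file is already there, so
    # there is no gap between "does it exist?" and "create it".
    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0644)
        or return;
    return $fh;
}

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/demo.marker";          # made-up name

my $first  = create_exclusively($path);
my $second = create_exclusively($path);

print "first attempt:  ", ($first  ? "created" : "failed"),  "\n";
print "second attempt: ", ($second ? "created" : "refused"), "\n";
```

Contrast this with testing C<-e $path> and then calling C<open>, where another process can slip in between the two steps.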
	856
	857	Perl's most portable locking interface is via the C<flock> function,
	858	whose simplicity is emulated on systems that don't directly support it
	859	such as SysV or Windows. The underlying semantics may affect how
	860	it all works, so you should learn how C<flock> is implemented on your
	861	system's port of Perl.
	862
	863	File locking I<does not> lock out another process that would like to
	864	do I/O. A file lock only locks out others trying to get a lock, not
	865	processes trying to do I/O. Because locks are advisory, if one process
	866	uses locking and another doesn't, all bets are off.
	867
	868	By default, the C<flock> call will block until a lock is granted.
	869	A request for a shared lock will be granted as soon as there is no
	870	exclusive locker. A request for an exclusive lock will be granted as
	871	soon as there is no locker of any kind. Locks are on file descriptors,
	872	not file names. You can't lock a file until you open it, and you can't
	873	hold on to a lock once the file has been closed.
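
The rule that a lock lasts only as long as the open handle can be checked with a short experiment. This is a sketch, assuming a Unix-like system with C<fork>; the helper C<other_process_can_lock> is hypothetical and exists only for this demonstration.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical helper: fork a child that opens $path itself and asks
# for a non-blocking exclusive lock. True if the child was granted it.
sub other_process_can_lock {
    my ($path) = @_;
    my $pid = fork();
    die "can't fork: $!" unless defined $pid;
    if ($pid == 0) {
        open(my $fh, "+<", $path) or POSIX::_exit(2);
        POSIX::_exit(flock($fh, LOCK_EX | LOCK_NB) ? 0 : 1);
    }
    waitpid($pid, 0);
    return ($? >> 8) == 0;
}

my ($fh, $path) = tempfile(UNLINK => 1);

flock($fh, LOCK_EX) or die "can't lock $path: $!";
my $while_held = other_process_can_lock($path);

close($fh) or die "can't close $path: $!";   # closing drops the lock
my $after_close = other_process_can_lock($path);

print "while held:  ", ($while_held  ? "granted" : "refused"), "\n";
print "after close: ", ($after_close ? "granted" : "refused"), "\n";
```

The child opens the file for itself rather than reusing the inherited handle, so it competes for the lock the way an unrelated program would.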
	874
	875	Here's how to get a blocking shared lock on a file, typically used
	876	for reading:
	877
	878	use 5.004;
	879	use Fcntl qw(:DEFAULT :flock);
	880	open(FH, "< filename") or die "can't open filename: $!";
	881	flock(FH, LOCK_SH) or die "can't lock filename: $!";
	882	# now read from FH
	883
	884	You can get a non-blocking lock by using C<LOCK_NB>.
	885
	886	flock(FH, LOCK_SH | LOCK_NB)
	887	or die "can't lock filename: $!";
	888
	889	This can be useful for producing more user-friendly behaviour by warning
	890	if you're going to be blocking:
	891
	892	use 5.004;
	893	use Fcntl qw(:DEFAULT :flock);
	894	open(FH, "< filename") or die "can't open filename: $!";
	895	unless (flock(FH, LOCK_SH | LOCK_NB)) {
	896	    $| = 1;
	897	    print "Waiting for lock...";
	898	    flock(FH, LOCK_SH) or die "can't lock filename: $!";
	899	    print "got it.\n";
	900	}
	901	# now read from FH
	902
	903	To get an exclusive lock, typically used for writing, you have to be
	904	careful. We C<sysopen> the file so it can be locked before it gets
	905	emptied. You can get a nonblocking version using C<LOCK_EX | LOCK_NB>.
	906
	907	use 5.004;
	908	use Fcntl qw(:DEFAULT :flock);
	909	sysopen(FH, "filename", O_WRONLY | O_CREAT)
	910	or die "can't open filename: $!";
	911	flock(FH, LOCK_EX)
	912	or die "can't lock filename: $!";
	913	truncate(FH, 0)
	914	or die "can't truncate filename: $!";
	915	# now write to FH
	916
	917	Finally, due to the uncounted millions who cannot be dissuaded from
	918	wasting cycles on useless vanity devices called hit counters, here's
	919	how to increment a number in a file safely:
	920
	921	use Fcntl qw(:DEFAULT :flock);
	922
	923	sysopen(FH, "numfile", O_RDWR | O_CREAT)
	924	or die "can't open numfile: $!";
	925	# autoflush FH
	926	$ofh = select(FH); $| = 1; select ($ofh);
	927	flock(FH, LOCK_EX)
	928	or die "can't write-lock numfile: $!";
	929
	930	$num = <FH> || 0;
	931	seek(FH, 0, 0)
	932	or die "can't rewind numfile: $!";
	933	print FH $num+1, "\n"
	934	or die "can't write numfile: $!";
	935
	936	truncate(FH, tell(FH))
	937	or die "can't truncate numfile: $!";
	938	close(FH)
	939	or die "can't close numfile: $!";
	940
	941	=head2 IO Layers
	942
	943	In Perl 5.8.0 a new I/O framework called "PerlIO" was introduced.
	944	This is a new "plumbing" for all the I/O happening in Perl; for the
	945	most part everything will work just as it did, but PerlIO also brought
	946	in some new features such as the ability to think of I/O as "layers".
	947	One I/O layer may in addition to just moving the data also do
	948	transformations on the data. Such transformations may include
	949	compression and decompression, encryption and decryption, and transforming
	950	between various character encodings.
	951
	952	Full discussion about the features of PerlIO is out of scope for this
	953	tutorial, but here is how to recognize the layers being used:
	954
	955	=over 4
	956
	957	=item *
	958
	959	The three-(or more)-argument form of C<open> is being used and the
	960	second argument contains something else in addition to the usual
	961	C<< '<' >>, C<< '>' >>, C<< '>>' >>, C<< '|' >> and their variants,
	962	for example:
	963
	964	open(my $fh, "<:crlf", $fn);
	965
	966	=item *
	967
	968	The two-argument form of C<binmode> is being used, for example
	969
	970	binmode($fh, ":encoding(utf16)");
	971
	972	=back
	973
	974	For more detailed discussion about PerlIO see L<PerlIO>;
	975	for more detailed discussion about Unicode and I/O see L<perluniintro>.
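
As a small illustration of a layer that transforms data on its way through, the sketch below writes a string through C<:encoding(UTF-8)> and then reads the file back twice, once as raw bytes and once through the same layer. The temporary file and the variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp);

my $text = "caf\x{e9}";     # 4 characters; the last is U+00E9

# Write through an encoding layer: characters in, UTF-8 bytes out.
open(my $out, ">:encoding(UTF-8)", $path)
    or die "can't write $path: $!";
print {$out} $text;
close($out) or die "can't close $path: $!";

# Read with no translation at all to see the bytes on disk.
open(my $raw, "<:raw", $path) or die "can't read $path: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

# Read through the same layer to recover the characters.
open(my $in, "<:encoding(UTF-8)", $path)
    or die "can't read $path: $!";
my $chars = do { local $/; <$in> };
close($in);

printf "%d bytes on disk, %d characters decoded\n",
       length($bytes), length($chars);
# prints "5 bytes on disk, 4 characters decoded"
```

The file holds five bytes because U+00E9 takes two bytes in UTF-8, while the program on either side of the layer sees only four characters.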
	976
	977	=head1 SEE ALSO
	978
	979	The C<open> and C<sysopen> functions in perlfunc(1);
	980	the system open(2), dup(2), fopen(3), and fdopen(3) manpages;
	981	the POSIX documentation.
	982
	983	=head1 AUTHOR and COPYRIGHT
	984
	985	Copyright 1998 Tom Christiansen.
	986
	987	This documentation is free; you can redistribute it and/or modify it
	988	under the same terms as Perl itself.
	989
	990	Irrespective of its distribution, all code examples in these files are
	991	hereby placed into the public domain. You are permitted and
	992	encouraged to use this code in your own programs for fun or for profit
	993	as you see fit. A simple comment in the code giving credit would be
	994	courteous but is not required.
	995
	996	=head1 HISTORY
	997
	998	First release: Sat Jan 9 08:09:11 MST 1999