perl5.git.perl.org Git - perl5.git/blame_incremental

Commit	Line	Data
	1	=head1 NAME
	2
	3	perlsyn - Perl syntax
	4
	5	=head1 DESCRIPTION
	6
	7	A Perl script consists of a sequence of declarations and statements.
	8	The only things that need to be declared in Perl are report formats
	9	and subroutines. See the sections below for more information on those
	10	declarations. All uninitialized user-created objects are assumed to
	11	start with a null or 0 value until they are defined by some explicit
	12	operation such as assignment. (Though you can get warnings about the
	13	use of undefined values if you like.) The sequence of statements is
	14	executed just once, unlike in B<sed> and B<awk> scripts, where the
	15	sequence of statements is executed for each input line. While this means
	16	that you must explicitly loop over the lines of your input file (or
	17	files), it also means you have much more control over which files and
	18	which lines you look at. (Actually, I'm lying--it is possible to do an
	19	implicit loop with either the B<-n> or B<-p> switch. It's just not the
	20	mandatory default like it is in B<sed> and B<awk>.)
	21
	22	=head2 Declarations
	23
	24	Perl is, for the most part, a free-form language. (The only
	25	exception to this is format declarations, for obvious reasons.) Comments
	26	are indicated by the "#" character, and extend to the end of the line. If
	27	you attempt to use C</* */> C-style comments, they will be interpreted
	28	either as division or pattern matching, depending on the context, and C++
	29	C<//> comments just look like a null regular expression, so don't do
	30	that.
	31
	32	A declaration can be put anywhere a statement can, but has no effect on
	33	the execution of the primary sequence of statements--declarations all
	34	take effect at compile time. Typically all the declarations are put at
	35	the beginning or the end of the script. However, if you're using
	36	lexically-scoped private variables created with C<my>, you'll have to make sure
	37	your format or subroutine definition is within the same block scope
	38	as the C<my> if you expect to be able to access those private variables.
	39
	40	Declaring a subroutine allows a subroutine name to be used as if it were a
	41	list operator from that point forward in the program. You can declare a
	42	subroutine (prototyped to take one scalar parameter) without defining it by saying just:
	43
	44	    sub myname ($);
	45	    $me = myname $0 or die "can't get myname";
	46
	47	Note that it functions as a list operator though, not as a unary
	48	operator, so be careful to use C<or> instead of C<||> there.
	49
	50	Subroutine declarations can also be loaded up with the C<require> statement
	51	or both loaded and imported into your namespace with a C<use> statement.
	52	See L<perlmod> for details on this.
	53
	54	A statement sequence may contain declarations of lexically-scoped
	55	variables, but apart from declaring a variable name, the declaration acts
	56	like an ordinary statement, and is elaborated within the sequence of
	57	statements as if it were an ordinary statement. That means it actually
	58	has both compile-time and run-time effects.
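
As a minimal sketch of that run-time effect (the names C<@subs> and C<$count> are illustrative only and do not come from the text above), each pass through the loop below executes the C<my> statement again and so creates a fresh variable, which each closure then captures separately:

```perl
use strict;
use warnings;

my @subs;
for my $n (1 .. 3) {
    # Executed at run time on every iteration: a new $count each pass.
    my $count = $n * 10;
    push @subs, sub { return $count };
}

my @values = map { $_->() } @subs;
print "@values\n";    # prints "10 20 30"
```

If the declaration had only a compile-time effect, all three closures would share one variable.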
	59
	60	=head2 Simple statements
	61
	62	The only kind of simple statement is an expression evaluated for its
	63	side effects. Every simple statement must be terminated with a
	64	semicolon, unless it is the final statement in a block, in which case
	65	the semicolon is optional. (A semicolon is still encouraged there if the
	66	block takes up more than one line, since you may eventually add another line.)
	67	Note that there are some operators like C<eval {}> and C<do {}> that look
	68	like compound statements, but aren't (they're just TERMs in an expression),
	69	and thus need an explicit termination if used as the last item in a statement.
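
The following sketch, with made-up variable names, shows both rules at once: C<do {}> and C<eval {}> are terms inside an ordinary assignment and so need a terminating semicolon, while the last statement inside a block may go without one.

```perl
use strict;
use warnings;

# do {} is a TERM, so the assignment statement needs its own semicolon.
my $sum = do { my $t = 0; $t += $_ for 1 .. 4; $t };

# Likewise for eval {}.
my $answer = eval { 6 * 7 };

# The final statement in a block may omit its semicolon.
if ($sum == 10) { print "$sum $answer\n" }    # prints "10 42"
```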
	70
	71	Any simple statement may optionally be followed by a I<SINGLE> modifier,
	72	just before the terminating semicolon (or block ending). The possible
	73	modifiers are:
	74
	75	    if EXPR
	76	    unless EXPR
	77	    while EXPR
	78	    until EXPR
	79
	80	The C<if> and C<unless> modifiers have the expected semantics,
	81	presuming you're a speaker of English. The C<while> and C<until>
	82	modifiers also have the usual "while loop" semantics (conditional
	83	evaluated first), except when applied to a do-BLOCK (or to the
	84	now-deprecated do-SUBROUTINE statement), in which case the block
	85	executes once before the conditional is evaluated. This is so that you
	86	can write loops like:
	87
	88	    do {
	89	        $line = <STDIN>;
	90	        ...
	91	    } until $line eq ".\n";
	92
	93	See L<perlfunc/do>. Note also that the loop control
	94	statements described later will I<NOT> work in this construct, since
	95	modifiers don't take loop labels. Sorry. You can always wrap
	96	another block around it to do that sort of thing.
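
One way to do that wrapping, sketched here with a hypothetical label C<LOOP> and counter C<$i>, is to put the do-BLOCK inside a labeled bare block and have C<last> target the label:

```perl
use strict;
use warnings;

my @seen;
my $i = 0;

LOOP: {
    do {
        $i++;
        last LOOP if $i == 3;    # exits the enclosing bare block
        push @seen, $i;
    } until $i >= 5;
}

print "@seen\n";    # prints "1 2"
```

The C<last> leaves the outer block, which takes the do-BLOCK with it.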
	97
	98	=head2 Compound statements
	99
	100	In Perl, a sequence of statements that defines a scope is called a block.
	101	Sometimes a block is delimited by the file containing it (in the case
	102	of a required file, or the program as a whole), and sometimes a block
	103	is delimited by the extent of a string (in the case of an eval).
	104
	105	But generally, a block is delimited by curly brackets, also known as braces.
	106	We will call this syntactic construct a BLOCK.
	107
	108	The following compound statements may be used to control flow:
	109
	110	    if (EXPR) BLOCK
	111	    if (EXPR) BLOCK else BLOCK
	112	    if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
	113	    LABEL while (EXPR) BLOCK
	114	    LABEL while (EXPR) BLOCK continue BLOCK
	115	    LABEL for (EXPR; EXPR; EXPR) BLOCK
	116	    LABEL foreach VAR (LIST) BLOCK
	117	    LABEL BLOCK continue BLOCK
	118
	119	Note that, unlike C and Pascal, these are defined in terms of BLOCKs,
	120	not statements. This means that the curly brackets are I<required>--no
	121	dangling statements allowed. If you want to write conditionals without
	122	curly brackets there are several other ways to do it. The following
	123	all do the same thing:
	124
	125	    if (!open(FOO)) { die "Can't open $FOO: $!"; }
	126	    die "Can't open $FOO: $!" unless open(FOO);
	127	    open(FOO) or die "Can't open $FOO: $!";     # FOO or bust!
	128	    open(FOO) ? 'hi mom' : die "Can't open $FOO: $!";
	129	                        # a bit exotic, that last one
	130
	131	The C<if> statement is straightforward. Since BLOCKs are always
	132	bounded by curly brackets, there is never any ambiguity about which
	133	C<if> an C<else> goes with. If you use C<unless> in place of C<if>,
	134	the sense of the test is reversed.
	135
	136	The C<while> statement executes the block as long as the expression is
	137	true (does not evaluate to the null string or 0 or "0"). The LABEL is
	138	optional, and if present, consists of an identifier followed by a colon.
	139	The LABEL identifies the loop for the loop control statements C<next>,
	140	C<last>, and C<redo>. If the LABEL is omitted, the loop control statement
	141	refers to the innermost enclosing loop. This may include dynamically
	142	looking back your call-stack at run time to find the LABEL. Such
	143	desperate behavior triggers a warning if you use the B<-w> flag.
	144
	145	If there is a C<continue> BLOCK, it is always executed just before the
	146	conditional is about to be evaluated again, just like the third part of a
	147	C<for> loop in C. Thus it can be used to increment a loop variable, even
	148	when the loop has been continued via the C<next> statement (which is
	149	similar to the C C<continue> statement).
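
A small sketch of this behavior, using invented variable names: the C<continue> block below runs on every pass, including the passes cut short by C<next>, so the loop variable still advances.

```perl
use strict;
use warnings;

my ($i, $passes) = (0, 0);
my @odd;

while ($i < 6) {
    next if $i % 2 == 0;    # skip even numbers; the continue block still runs
    push @odd, $i;
} continue {
    $i++;
    $passes++;
}

print "@odd ($passes passes)\n";    # prints "1 3 5 (6 passes)"
```

Had the increment been written at the bottom of the main block instead, the first C<next> would have skipped it and the loop would never terminate.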
	150
	151	=head2 Loop Control
	152
	153	The C<next> command is like the C<continue> statement in C; it starts
	154	the next iteration of the loop:
	155
	156	    LINE: while (<STDIN>) {
	157	        next LINE if /^#/;      # discard comments
	158	        ...
	159	    }
	160
	161	The C<last> command is like the C<break> statement in C (as used in
	162	loops); it immediately exits the loop in question. The
	163	C<continue> block, if any, is not executed:
	164
	165	    LINE: while (<STDIN>) {
	166	        last LINE if /^$/;      # exit when done with header
	167	        ...
	168	    }
	169
	170	The C<redo> command restarts the loop block without evaluating the
	171	conditional again. The C<continue> block, if any, is I<not> executed.
	172	This command is normally used by programs that want to lie to themselves
	173	about what was just input.
	174
	175	For example, when processing a file like F</etc/termcap>, your input
	176	lines might end in backslashes to indicate continuation; in that case
	177	you want to skip ahead and get the next record.
	178
	179	    while (<>) {
	180	        chomp;
	181	        if (s/\\$//) {
	182	            $_ .= <>;
	183	            redo unless eof();
	184	        }
	185	        # now process $_
	186	    }
	187
	188	which is Perl short-hand for the more explicitly written version:
	189
	190	    LINE: while ($line = <ARGV>) {
	191	        chomp($line);
	192	        if ($line =~ s/\\$//) {
	193	            $line .= <ARGV>;
	194	            redo LINE unless eof(); # not eof(ARGV)!
	195	        }
	196	        # now process $line
	197	    }
	198
	199	Or here's a simpleminded Pascal comment stripper (warning: assumes no { or } in strings).
	200
	201	    LINE: while (<STDIN>) {
	202	        while (s|({.*}.*){.*}|$1 |) {}
	203	        s|{.*}| |;
	204	        if (s|{.*| |) {
	205	            $front = $_;
	206	            while (<STDIN>) {
	207	                if (/}/) {      # end of comment?
	208	                    s|^|$front\{|;
	209	                    redo LINE;
	210	                }
	211	            }
	212	        }
	213	        print;
	214	    }
	215
	216	Note that if there were a C<continue> block on the above code, it would get
	217	executed even on discarded lines.
	218
	219	If the word C<while> is replaced by the word C<until>, the sense of the
	220	test is reversed, but the conditional is still tested before the first
	221	iteration.
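
A minimal sketch of C<until>, with illustrative variable names: the first loop runs while its condition is false, and the second never runs at all because its condition is already true when first tested.

```perl
use strict;
use warnings;

my $n = 1;
until ($n > 100) {
    $n *= 2;            # repeats while $n > 100 is false
}

my $done = 1;
my $ran  = 0;
until ($done) {
    $ran = 1;           # never reached: the test comes before the first pass
}

print "$n $ran\n";      # prints "128 0"
```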
	222
	223	In either the C<if> or the C<while> statement, you may replace "(EXPR)"
	224	with a BLOCK, and the conditional is true if the value of the last
	225	statement in that block is true. While this "feature" continues to work in
	226	version 5, it has been deprecated, so please change any occurrences of "if BLOCK" to
	227	"if (do BLOCK)".
	228
	229	=head2 For Loops
	230
	231	Perl's C-style C<for> loop works exactly like the corresponding C<while> loop;
	232	that means that this:
	233
	234	    for ($i = 1; $i < 10; $i++) {
	235	        ...
	236	    }
	237
	238	is the same as this:
	239
	240	    $i = 1;
	241	    while ($i < 10) {
	242	        ...
	243	    } continue {
	244	        $i++;
	245	    }
	246
	247	Besides the normal array index looping, C<for> can lend itself
	248	to many other interesting applications. Here's one that avoids the
	249	problem you get into if you explicitly test for end-of-file on
	250	an interactive file descriptor, which causes your program to appear to
	251	hang.
	252
	253	    $on_a_tty = -t STDIN && -t STDOUT;
	254	    sub prompt { print "yes? " if $on_a_tty }
	255	    for ( prompt(); <STDIN>; prompt() ) {
	256	        # do something
	257	    }
	258
	259	=head2 Foreach Loops
	260
	261	The C<foreach> loop iterates over a normal list value and sets the
	262	variable VAR to be each element of the list in turn. The variable is
	263	implicitly local to the loop and regains its former value upon exiting the
	264	loop. If the variable was previously declared with C<my>, it uses that
	265	variable instead of the global one, but it's still localized to the loop.
	266	This can cause problems if you have subroutine or format declarations
	267	within that block's scope.
	268
	269	The C<foreach> keyword is actually a synonym for the C<for> keyword, so
	270	you can use C<foreach> for readability or C<for> for brevity. If VAR is
	271	omitted, $_ is set to each value. If LIST is an actual array (as opposed
	272	to an expression returning a list value), you can modify each element of
	273	the array by modifying VAR inside the loop. That's because the C<foreach>
	274	loop index variable is an implicit alias for each item in the list that
	275	you're looping over.
	276
	277	Examples:
	278
	279	    for (@ary) { s/foo/bar/ }
	280
	281	    foreach $elem (@elements) {
	282	        $elem *= 2;
	283	    }
	284
	285	    for $count (10,9,8,7,6,5,4,3,2,1,'BOOM') {
	286	        print $count, "\n"; sleep(1);
	287	    }
	288
	289	    for (1..15) { print "Merry Christmas\n"; }
	290
	291	    foreach $item (split(/:[\\\n:]*/, $ENV{TERMCAP})) {
	292	        print "Item: $item\n";
	293	    }
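
A further sketch, with invented names, that makes the aliasing and the localization described above visible at the same time: the array elements are changed through the loop variable, and the variable gets its former value back once the loop ends.

```perl
use strict;
use warnings;

my @prices = (1, 2, 3);
my $item   = "before";

for $item (@prices) {
    $item *= 2;             # modifies the array element itself
}

print "@prices $item\n";    # prints "2 4 6 before"
```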
	294
	295	Here's how a C programmer might code up a particular algorithm in Perl:
	296
	297	    for ($i = 0; $i < @ary1; $i++) {
	298	        for ($j = 0; $j < @ary2; $j++) {
	299	            if ($ary1[$i] > $ary2[$j]) {
	300	                last; # can't go to outer :-(
	301	            }
	302	            $ary1[$i] += $ary2[$j];
	303	        }
	304	        # this is where that last takes me
	305	    }
	306
	307	Whereas here's how a Perl programmer more comfortable with the idiom might
	308	do it:
	309
	310	    OUTER: foreach $wid (@ary1) {
	311	        INNER: foreach $jet (@ary2) {
	312	            next OUTER if $wid > $jet;
	313	            $wid += $jet;
	314	        }
	315	    }
	316
	317	See how much easier this is? It's cleaner, safer, and faster. It's
	318	cleaner because it's less noisy. It's safer because if code gets added
	319	between the inner and outer loops later on, the new code won't be
	320	accidentally executed. The C<next> explicitly iterates the other loop
	321	rather than merely terminating the inner one. And it's faster because
	322	Perl executes a C<foreach> statement more rapidly than it would the
	323	equivalent C<for> loop.
	324
	325	=head2 Basic BLOCKs and Switch Statements
	326
	327	A BLOCK by itself (labeled or not) is semantically equivalent to a loop
	328	that executes once. Thus you can use any of the loop control
	329	statements in it to leave or restart the block. (Note that this
	330	is I<NOT> true in C<eval{}>, C<sub{}>, or, contrary to popular belief, C<do{}> blocks,
	331	which do I<NOT> count as loops.) The C<continue> block
	332	is optional.
	333
	334	The BLOCK construct is particularly nice for doing case
	335	structures.
	336
	337	    SWITCH: {
	338	        if (/^abc/) { $abc = 1; last SWITCH; }
	339	        if (/^def/) { $def = 1; last SWITCH; }
	340	        if (/^xyz/) { $xyz = 1; last SWITCH; }
	341	        $nothing = 1;
	342	    }
	343
	344	There is no official switch statement in Perl, because there are
	345	already several ways to write the equivalent. In addition to the
	346	above, you could write
	347
	348	    SWITCH: {
	349	        $abc = 1, last SWITCH  if /^abc/;
	350	        $def = 1, last SWITCH  if /^def/;
	351	        $xyz = 1, last SWITCH  if /^xyz/;
	352	        $nothing = 1;
	353	    }
	354
	355	(That's actually not as strange as it looks once you realize that you can
	356	use loop control "operators" within an expression. That's just the normal
	357	C comma operator.)
	358
	359	or
	360
	361	    SWITCH: {
	362	        /^abc/ && do { $abc = 1; last SWITCH; };
	363	        /^def/ && do { $def = 1; last SWITCH; };
	364	        /^xyz/ && do { $xyz = 1; last SWITCH; };
	365	        $nothing = 1;
	366	    }
	367
	368	or formatted so it stands out more as a "proper" switch statement:
	369
	370	    SWITCH: {
	371	        /^abc/ && do {
	372	            $abc = 1;
	373	            last SWITCH;
	374	        };
	375
	376	        /^def/ && do {
	377	            $def = 1;
	378	            last SWITCH;
	379	        };
	380
	381	        /^xyz/ && do {
	382	            $xyz = 1;
	383	            last SWITCH;
	384	        };
	385	        $nothing = 1;
	386	    }
	387
	388	or
	389
	390	    SWITCH: {
	391	        /^abc/ and $abc = 1, last SWITCH;
	392	        /^def/ and $def = 1, last SWITCH;
	393	        /^xyz/ and $xyz = 1, last SWITCH;
	394	        $nothing = 1;
	395	    }
	396
	397	or even, horrors,
	398
	399	    if (/^abc/)
	400	        { $abc = 1 }
	401	    elsif (/^def/)
	402	        { $def = 1 }
	403	    elsif (/^xyz/)
	404	        { $xyz = 1 }
	405	    else
	406	        { $nothing = 1 }
	407
	408
	409	A common idiom for a switch statement is to use C<foreach>'s aliasing to make
	410	a temporary assignment to $_ for convenient matching:
	411
	412	    SWITCH: for ($where) {
	413	        /In Card Names/ && do { push @flags, '-e'; last; };
	414	        /Anywhere/      && do { push @flags, '-h'; last; };
	415	        /In Rulings/    && do {                    last; };
	416	        die "unknown value for form variable where: `$where'";
	417	    }
	418
	419	Another interesting approach to a switch statement is to arrange
	420	for a C<do> block to return the proper value:
	421
	422	    $amode = do {
	423	        if     ($flag & O_RDONLY) { "r" }    # XXX: isn't O_RDONLY usually 0?
	424	        elsif  ($flag & O_WRONLY) { ($flag & O_APPEND) ? "a" : "w" }
	425	        elsif  ($flag & O_RDWR)   {
	426	            if ($flag & O_CREAT)  { "w+" }
	427	            else                  { ($flag & O_APPEND) ? "a+" : "r+" }
	428	        }
	429	    };
	430
	431	=head2 Goto
	432
	433	Although not for the faint of heart, Perl does support a C<goto> statement.
	434	A loop's LABEL is not actually a valid target for a C<goto>;
	435	it's just the name of the loop. There are three forms: goto-LABEL,
	436	goto-EXPR, and goto-&NAME.
	437
	438	The goto-LABEL form finds the statement labeled with LABEL and resumes
	439	execution there. It may not be used to go into any construct that
	440	requires initialization, such as a subroutine or a foreach loop. It
	441	also can't be used to go into a construct that is optimized away. It
	442	can be used to go almost anywhere else within the dynamic scope,
	443	including out of subroutines, but it's usually better to use some other
	444	construct such as C<last> or C<die>. The author of Perl has never felt the
	445	need to use this form of goto (in Perl, that is--C is another matter).
	446
	447	The goto-EXPR form expects a label name, whose scope will be resolved
	448	dynamically. This allows for computed gotos per FORTRAN, but isn't
	449	necessarily recommended if you're optimizing for maintainability:
	450
	451	    goto ("FOO", "BAR", "GLARCH")[$i];
	452
	453	The goto-&NAME form is highly magical, and substitutes a call to the
	454	named subroutine for the currently running subroutine. This is used by
	455	AUTOLOAD() subroutines that wish to load another subroutine and then
	456	pretend that the other subroutine had been called in the first place
	457	(except that any modifications to @_ in the current subroutine are
	458	propagated to the other subroutine). After the C<goto>, not even caller()
	459	will be able to tell that this routine was called first.
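
A minimal sketch of that AUTOLOAD pattern follows. The package C<Greeter>, the function C<hello>, and the behavior of the generated subroutine are all hypothetical, chosen only to show the mechanics of goto-&NAME:

```perl
use strict;
use warnings;

package Greeter;

our $AUTOLOAD;

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;              # strip the package qualifier
    return if $name eq 'DESTROY';

    # Build the missing subroutine on demand (hypothetical behavior).
    my $code = sub { return $name . " got " . join(",", @_) };

    no strict 'refs';
    *{$AUTOLOAD} = $code;           # install it for later calls
    goto &$code;                    # replace this frame; @_ is passed along
}

package main;

my $first  = Greeter::hello("world");    # resolved through AUTOLOAD
my $second = Greeter::hello("again");    # now calls the installed sub directly
print "$first / $second\n";    # prints "hello got world / hello got again"
```

Because the C<goto> replaces the running AUTOLOAD frame, the generated subroutine sees the original caller and the current C<@_>.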
	460
	461	In almost all cases like this, it's usually a far, far better idea to use the
	462	structured control flow mechanisms of C<next>, C<last>, or C<redo> instead of
	463	resorting to a C<goto>. For certain applications, the catch and throw pair of
	464	C<eval{}> and die() for exception processing can also be a prudent approach.
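
As a sketch of the C<eval{}> and die() pairing (C<parse_port> is a hypothetical helper, not something defined elsewhere in this document), the exception thrown inside the block is caught by C<eval> and left in C<$@>:

```perl
use strict;
use warnings;

# Hypothetical helper that throws on bad input.
sub parse_port {
    my ($text) = @_;
    die "not a number: $text\n" unless $text =~ /^\d+$/;
    return $text + 0;
}

my $port  = eval { parse_port("80x") };    # die() unwinds to here
my $error = $@;                            # save the message right away

$port = 8080 unless defined $port;         # fall back to a default
print "port=$port error=$error";
```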
	465
	466	=head2 PODs: Embedded Documentation
	467
	468	Perl has a mechanism for intermixing documentation with source code.
	469	While it's expecting the beginning of a new statement, if the compiler
	470	encounters a line that begins with an equal sign and a word, like this
	471
	472	    =head1 Here There Be Pods!
	473
	474	then that text and all remaining text up through and including a line
	475	beginning with C<=cut> will be ignored. The format of the intervening
	476	text is described in L<perlpod>.
	477
	478	This allows you to intermix your source code
	479	and your documentation text freely, as in
	480
	481	    =item snazzle($)
	482
	483	    The snazzle() function will behave in the most spectacular
	484	    form that you can possibly imagine, not even excepting
	485	    cybernetic pyrotechnics.
	486
	487	    =cut back to the compiler, nuff of this pod stuff!
	488
	489	    sub snazzle($) {
	490	        my $thingie = shift;
	491	        .........
	492	    }
	493
	494	Note that pod translators should only look at paragraphs beginning
	495	with a pod directive (it makes parsing easier), whereas the compiler
	496	actually knows to look for pod escapes even in the middle of a
	497	paragraph. This means that the following secret stuff will be
	498	ignored by both the compiler and the translators.
	499
	500	    $a=3;
	501	    =secret stuff
	502	     warn "Neither POD nor CODE!?"
	503	    =cut back
	504	    print "got $a\n";
	505
	506	You probably shouldn't rely upon the warn() being podded out forever.
	507	Not all pod translators are well-behaved in this regard, and perhaps
	508	the compiler will become pickier.