perl5.git.perl.org Git - perl5.git/blame_incremental

Commit	Line	Data
	1	=head1 NAME
	2
	3	perlsyn - Perl syntax
	4
	5	=head1 DESCRIPTION
	6
	7	A Perl script consists of a sequence of declarations and statements.
	8	The only things that need to be declared in Perl are report formats
	9	and subroutines. See the sections below for more information on those
	10	declarations. All uninitialized user-created objects are assumed to
	11	start with a null or 0 value until they are defined by some explicit
	12	operation such as assignment. (Though you can get warnings about the
	13	use of undefined values if you like.) The sequence of statements is
	14	executed just once, unlike in B<sed> and B<awk> scripts, where the
	15	sequence of statements is executed for each input line. While this means
	16	that you must explicitly loop over the lines of your input file (or
	17	files), it also means you have much more control over which files and
	18	which lines you look at. (Actually, I'm lying--it is possible to do an
	19	implicit loop with either the B<-n> or B<-p> switch. It's just not the
	20	mandatory default like it is in B<sed> and B<awk>.)
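
To make the contrast concrete, here is a minimal sketch of the kind of explicit loop a Perl script writes for itself. The in-memory string C<$input> is only a stand-in for a real file, and the variable names are invented for illustration.

```perl
use strict;
use warnings;

# Stand-in for a real input file (an assumption made for this sketch).
my $input = "alpha\n\nbeta\n";
open(my $fh, '<', \$input) or die "can't open in-memory file: $!";

my @kept;
while (my $line = <$fh>) {      # the loop is spelled out, not implied
    next if $line =~ /^$/;      # we decide which lines to look at
    chomp $line;
    push @kept, $line;
}
close($fh);

print join(",", @kept), "\n";   # prints "alpha,beta"
```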
	21
	22	=head2 Declarations
	23
	24	Perl is, for the most part, a free-form language. (The only
	25	exception to this is format declarations, for obvious reasons.) Comments
	26	are indicated by the "#" character, and extend to the end of the line. If
	27	you attempt to use C</* */> C-style comments, they will be interpreted
	28	either as division or pattern matching, depending on the context, and C++
	29	C<//> comments just look like a null regular expression, so don't do
	30	that.
	31
	32	A declaration can be put anywhere a statement can, but has no effect on
	33	the execution of the primary sequence of statements--declarations all
	34	take effect at compile time. Typically all the declarations are put at
	35	the beginning or the end of the script. However, if you're using
	36	lexically-scoped private variables created with my(), you'll have to make sure
	37	your format or subroutine definition is within the same block scope
	38	as the my if you expect to be able to access those private variables.
	39
	40	Declaring a subroutine allows a subroutine name to be used as if it were a
	41	list operator from that point forward in the program. You can declare a
	42	subroutine (prototyped to take one scalar parameter) without defining it by saying just:
	43
	44	    sub myname ($);
	45	    $me = myname $0 or die "can't get myname";
	46
	47	Note that it functions as a list operator though, not as a unary
	48	operator, so be careful to use C<or> instead of C<||> there.
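
The difference between C<or> and C<||> after a list operator can be seen in the following sketch. The name C<show_arg> is hypothetical, and the declaration deliberately carries no prototype, because a C<($)> prototype changes how the call is parsed.

```perl
use strict;
use warnings;

sub show_arg;       # hypothetical name; declared here, defined further down

# "or" binds more loosely than a list operator, so show_arg(0) is called
# first and "or 7" applies to its result.
my $with_or = (show_arg 0 or 7);

# "||" binds more tightly, so it is swallowed into the argument list:
# this calls show_arg(0 || 7), that is, show_arg(7).
my $with_pipes = (show_arg 0 || 7);

sub show_arg { return "got:$_[0]" }

print "$with_or $with_pipes\n";     # prints "got:0 got:7"
```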
	49
	50	Subroutine declarations can also be loaded up with the C<require> statement
	51	or both loaded and imported into your namespace with a C<use> statement.
	52	See L<perlmod> for details on this.
	53
	54	A statement sequence may contain declarations of lexically-scoped
	55	variables, but apart from declaring a variable name, the declaration acts
	56	like an ordinary statement, and is elaborated within the sequence of
	57	statements as if it were an ordinary statement. That means it actually
	58	has both compile-time and run-time effects.
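
As a rough illustration of those two effects, the sketch below declares a lexical inside a loop body. The variable names are invented for the example.

```perl
use strict;
use warnings;

my @subs;
for my $n (1 .. 3) {
    # Compile time: the name $scaled is introduced for the rest of the block.
    # Run time: a fresh $scaled is created on every pass through the loop.
    my $scaled = $n * 10;
    push @subs, sub { return $scaled };
}

my @values = map { $_->() } @subs;
print "@values\n";      # prints "10 20 30"
```

Each closure sees its own copy of C<$scaled>, which would not happen if the declaration had no run-time effect.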
	59
	60	=head2 Simple statements
	61
	62	The only kind of simple statement is an expression evaluated for its
	63	side effects. Every simple statement must be terminated with a
	64	semicolon, unless it is the final statement in a block, in which case
	65	the semicolon is optional. (A semicolon is still encouraged there if the
	66	block takes up more than one line, because you may eventually add another line.)
	67	Note that there are some operators like C<eval {}> and C<do {}> that look
	68	like compound statements, but aren't (they're just TERMs in an expression),
	69	and thus need an explicit termination if used as the last item in a statement.
	70
	71	Any simple statement may optionally be followed by a I<SINGLE> modifier,
	72	just before the terminating semicolon (or block ending). The possible
	73	modifiers are:
	74
	75	    if EXPR
	76	    unless EXPR
	77	    while EXPR
	78	    until EXPR
	79
	80	The C<if> and C<unless> modifiers have the expected semantics,
	81	presuming you're a speaker of English. The C<while> and C<until>
	82	modifiers also have the usual "while loop" semantics (conditional
	83	evaluated first), except when applied to a do-BLOCK (or to the
	84	now-deprecated do-SUBROUTINE statement), in which case the block
	85	executes once before the conditional is evaluated. This is so that you
	86	can write loops like:
	87
	88	    do {
	89	        $line = <STDIN>;
	90	        ...
	91	    } until $line eq ".\n";
	92
	93	See L<perlfunc/do>. Note also that the loop control
	94	statements described later will I<NOT> work in this construct, because
	95	modifiers don't take loop labels. Sorry. You can always wrap
	96	another block around it to do that sort of thing.
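
Here is a small sketch of both points: the order of evaluation for the modifiers, and the trick of wrapping another block around a do-BLOCK so that loop control has something to act on. The label and variable names are invented for illustration.

```perl
use strict;
use warnings;

my $skipped = 0;
$skipped++ while 0;             # condition tested first: never runs

my $ran = 0;
do { $ran++ } until 1;          # do-BLOCK: body runs once before the test

# Loop control needs a real block to act on, so wrap one around the do.
my $count = 0;
ATTEMPT: {
    do {
        $count++;
        last ATTEMPT if $count == 3;    # leaves the enclosing bare block
    } until $count >= 10;
}

print "$skipped $ran $count\n";     # prints "0 1 3"
```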
	97
	98	=head2 Compound statements
	99
	100	In Perl, a sequence of statements that defines a scope is called a block.
	101	Sometimes a block is delimited by the file containing it (in the case
	102	of a required file, or the program as a whole), and sometimes a block
	103	is delimited by the extent of a string (in the case of an eval).
	104
	105	But generally, a block is delimited by curly brackets, also known as braces.
	106	We will call this syntactic construct a BLOCK.
	107
	108	The following compound statements may be used to control flow:
	109
	110	    if (EXPR) BLOCK
	111	    if (EXPR) BLOCK else BLOCK
	112	    if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
	113	    LABEL while (EXPR) BLOCK
	114	    LABEL while (EXPR) BLOCK continue BLOCK
	115	    LABEL for (EXPR; EXPR; EXPR) BLOCK
	116	    LABEL foreach VAR (LIST) BLOCK
	117	    LABEL BLOCK continue BLOCK
	118
	119	Note that, unlike C and Pascal, these are defined in terms of BLOCKs,
	120	not statements. This means that the curly brackets are I<required>--no
	121	dangling statements allowed. If you want to write conditionals without
	122	curly brackets there are several other ways to do it. The following
	123	all do the same thing:
	124
	125	    if (!open(FOO)) { die "Can't open $FOO: $!"; }
	126	    die "Can't open $FOO: $!" unless open(FOO);
	127	    open(FOO) or die "Can't open $FOO: $!";     # FOO or bust!
	128	    open(FOO) ? 'hi mom' : die "Can't open $FOO: $!";
	129	                        # a bit exotic, that last one
	130
	131	The C<if> statement is straightforward. Because BLOCKs are always
	132	bounded by curly brackets, there is never any ambiguity about which
	133	C<if> an C<else> goes with. If you use C<unless> in place of C<if>,
	134	the sense of the test is reversed.
	135
	136	The C<while> statement executes the block as long as the expression is
	137	true (does not evaluate to the null string or 0 or "0"). The LABEL is
	138	optional, and if present, consists of an identifier followed by a colon.
	139	The LABEL identifies the loop for the loop control statements C<next>,
	140	C<last>, and C<redo>. If the LABEL is omitted, the loop control statement
	141	refers to the innermost enclosing loop. This may include dynamically
	142	looking back your call-stack at run time to find the LABEL. Such
	143	desperate behavior triggers a warning if you use the B<-w> flag.
	144
	145	If there is a C<continue> BLOCK, it is always executed just before the
	146	conditional is about to be evaluated again, just like the third part of a
	147	C<for> loop in C. Thus it can be used to increment a loop variable, even
	148	when the loop has been continued via the C<next> statement (which is
	149	similar to the C C<continue> statement).
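
A minimal sketch of that behavior, with invented variable names, might look like this:

```perl
use strict;
use warnings;

my $i = 0;
my @odd;
while ($i < 6) {
    next unless $i % 2;     # skip even numbers
    push @odd, $i;
} continue {
    $i++;                   # still runs when the pass was cut short by next
}

print "@odd\n";             # prints "1 3 5"
```

Without the C<continue> block, the C<next> would skip the increment and the loop would never finish.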
	150
	151	=head2 Loop Control
	152
	153	The C<next> command is like the C<continue> statement in C; it starts
	154	the next iteration of the loop:
	155
	156	    LINE: while (<STDIN>) {
	157	        next LINE if /^#/;      # discard comments
	158	        ...
	159	    }
	160
	161	The C<last> command is like the C<break> statement in C (as used in
	162	loops); it immediately exits the loop in question. The
	163	C<continue> block, if any, is not executed:
	164
	165	    LINE: while (<STDIN>) {
	166	        last LINE if /^$/;      # exit when done with header
	167	        ...
	168	    }
	169
	170	The C<redo> command restarts the loop block without evaluating the
	171	conditional again. The C<continue> block, if any, is I<not> executed.
	172	This command is normally used by programs that want to lie to themselves
	173	about what was just input.
	174
	175	For example, when processing a file like F</etc/termcap>,
	176	if your input lines might end in backslashes to indicate continuation, you
	177	want to skip ahead and get the next record.
	178
	179	    while (<>) {
	180	        chomp;
	181	        if (s/\\$//) {
	182	            $_ .= <>;
	183	            redo unless eof();
	184	        }
	185	        # now process $_
	186	    }
	187
	188	which is Perl short-hand for the more explicitly written version:
	189
	190	    LINE: while ($line = <ARGV>) {
	191	        chomp($line);
	192	        if ($line =~ s/\\$//) {
	193	            $line .= <ARGV>;
	194	            redo LINE unless eof();     # not eof(ARGV)!
	195	        }
	196	        # now process $line
	197	    }
	198
	199	Or here's a simpleminded Pascal comment stripper (warning: assumes no { or } in strings).
	200
	201	    LINE: while (<STDIN>) {
	202	        while (s|({.*}.*){.*}|$1 |) {}
	203	        s|{.*}| |;
	204	        if (s|{.*| |) {
	205	            $front = $_;
	206	            while (<STDIN>) {
	207	                if (/}/) {      # end of comment?
	208	                    s|^|$front\{|;
	209	                    redo LINE;
	210	                }
	211	            }
	212	        }
	213	        print;
	214	    }
	215
	216	Note that if there were a C<continue> block on the above code, it would not
	217	be executed when C<redo LINE> restarts the loop, because C<redo> always skips it.
	218
	219	If the word C<while> is replaced by the word C<until>, the sense of the
	220	test is reversed, but the conditional is still tested before the first
	221	iteration.
	222
	223	The form C<while/if BLOCK BLOCK>, available in Perl 4, is no longer
	224	available. Replace any occurrence of C<if BLOCK> by C<if (do BLOCK)>.
	225
	226	=head2 For Loops
	227
	228	Perl's C-style C<for> loop works exactly like the corresponding C<while> loop;
	229	that means that this:
	230
	231	    for ($i = 1; $i < 10; $i++) {
	232	        ...
	233	    }
	234
	235	is the same as this:
	236
	237	    $i = 1;
	238	    while ($i < 10) {
	239	        ...
	240	    } continue {
	241	        $i++;
	242	    }
	243
	244	(There is one minor difference: The first form implies a lexical scope
	245	for variables declared with C<my> in the initialization expression.)
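
That difference can be seen in a short sketch, where the names are invented for illustration:

```perl
use strict;
use warnings;

my $i   = "outer";
my $sum = 0;

for (my $i = 1; $i <= 3; $i++) {    # this $i lives only inside the loop
    $sum += $i;
}

print "$sum $i\n";      # prints "6 outer"
```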
	246
	247	Besides the normal array index looping, C<for> can lend itself
	248	to many other interesting applications. Here's one that avoids the
	249	problem you get into if you explicitly test for end-of-file on
	250	an interactive file descriptor, which causes your program to appear to
	251	hang.
	252
	253	    $on_a_tty = -t STDIN && -t STDOUT;
	254	    sub prompt { print "yes? " if $on_a_tty }
	255	    for ( prompt(); <STDIN>; prompt() ) {
	256	        # do something
	257	    }
	258
	259	=head2 Foreach Loops
	260
	261	The C<foreach> loop iterates over a normal list value and sets the
	262	variable VAR to be each element of the list in turn. If the variable
	263	is preceded with the keyword C<my>, then it is lexically scoped, and
	264	is therefore visible only within the loop. Otherwise, the variable is
	265	implicitly local to the loop and regains its former value upon exiting
	266	the loop. If the variable was previously declared with C<my>, it uses
	267	that variable instead of the global one, but it's still localized to
	268	the loop. (Note that a lexically scoped variable can cause problems
	269	if you have subroutine or format declarations.)
	270
	271	The C<foreach> keyword is actually a synonym for the C<for> keyword, so
	272	you can use C<foreach> for readability or C<for> for brevity. If VAR is
	273	omitted, $_ is set to each value. If LIST is an actual array (as opposed
	274	to an expression returning a list value), you can modify each element of
	275	the array by modifying VAR inside the loop. That's because the C<foreach>
	276	loop index variable is an implicit alias for each item in the list that
	277	you're looping over.
	278
	279	Examples:
	280
	281	    for (@ary) { s/foo/bar/ }
	282
	283	    foreach my $elem (@elements) {
	284	        $elem *= 2;
	285	    }
	286
	287	    for $count (10,9,8,7,6,5,4,3,2,1,'BOOM') {
	288	        print $count, "\n"; sleep(1);
	289	    }
	290
	291	    for (1..15) { print "Merry Christmas\n"; }
	292
	293	    foreach $item (split(/:[\\\n:]*/, $ENV{TERMCAP})) {
	294	        print "Item: $item\n";
	295	    }
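
The aliasing and localization rules for the loop variable can also be checked directly. The following is only a sketch with invented names:

```perl
use strict;
use warnings;

my @prices = (1, 2, 3);
$_ *= 2 for @prices;        # $_ aliases each element in turn

my $item = "before";
for $item (@prices) {       # reuses the lexical $item, localized to the loop
    $item += 1;             # still an alias, so this changes @prices
}

print "@prices $item\n";    # prints "3 5 7 before"
```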
	296
	297	Here's how a C programmer might code up a particular algorithm in Perl:
	298
	299	    for (my $i = 0; $i < @ary1; $i++) {
	300	        for (my $j = 0; $j < @ary2; $j++) {
	301	            if ($ary1[$i] > $ary2[$j]) {
	302	                last; # can't go to outer :-(
	303	            }
	304	            $ary1[$i] += $ary2[$j];
	305	        }
	306	        # this is where that last takes me
	307	    }
	308
	309	Whereas here's how a Perl programmer more comfortable with the idiom might
	310	do it:
	311
	312	    OUTER: foreach my $wid (@ary1) {
	313	        INNER: foreach my $jet (@ary2) {
	314	            next OUTER if $wid > $jet;
	315	            $wid += $jet;
	316	        }
	317	    }
	318
	319	See how much easier this is? It's cleaner, safer, and faster. It's
	320	cleaner because it's less noisy. It's safer because if code gets added
	321	between the inner and outer loops later on, the new code won't be
	322	accidentally executed. The C<next> explicitly iterates the other loop
	323	rather than merely terminating the inner one. And it's faster because
	324	Perl executes a C<foreach> statement more rapidly than it would the
	325	equivalent C<for> loop.
	326
	327	=head2 Basic BLOCKs and Switch Statements
	328
	329	A BLOCK by itself (labeled or not) is semantically equivalent to a
	330	loop that executes once. Thus you can use any of the loop control
	331	statements in it to leave or restart the block. (Note that this is
	332	I<NOT> true in C<eval{}>, C<sub{}>, or, contrary to popular belief,
	333	C<do{}> blocks, which do I<NOT> count as loops.) The C<continue>
	334	block is optional.
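
For instance, a bare block can be restarted with C<redo> or left early with C<last>. This is a minimal sketch; the labels and variables are made up for the example.

```perl
use strict;
use warnings;

my $attempts = 0;
RETRY: {
    $attempts++;
    redo RETRY if $attempts < 3;    # restart the block from the top
}

my $found;
SEARCH: {
    $found = "early";
    last SEARCH;                    # leave the block at once
    $found = "late";                # never reached
}

print "$attempts $found\n";         # prints "3 early"
```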
	335
	336	The BLOCK construct is particularly nice for doing case
	337	structures.
	338
	339	    SWITCH: {
	340	        if (/^abc/) { $abc = 1; last SWITCH; }
	341	        if (/^def/) { $def = 1; last SWITCH; }
	342	        if (/^xyz/) { $xyz = 1; last SWITCH; }
	343	        $nothing = 1;
	344	    }
	345
	346	There is no official switch statement in Perl, because there are
	347	already several ways to write the equivalent. In addition to the
	348	above, you could write
	349
	350	    SWITCH: {
	351	        $abc = 1, last SWITCH if /^abc/;
	352	        $def = 1, last SWITCH if /^def/;
	353	        $xyz = 1, last SWITCH if /^xyz/;
	354	        $nothing = 1;
	355	    }
	356
	357	(That's actually not as strange as it looks once you realize that you can
	358	use loop control "operators" within an expression. That's just the normal
	359	C comma operator.)
	360
	361	or
	362
	363	    SWITCH: {
	364	        /^abc/ && do { $abc = 1; last SWITCH; };
	365	        /^def/ && do { $def = 1; last SWITCH; };
	366	        /^xyz/ && do { $xyz = 1; last SWITCH; };
	367	        $nothing = 1;
	368	    }
	369
	370	or formatted so it stands out more as a "proper" switch statement:
	371
	372	    SWITCH: {
	373	        /^abc/ && do {
	374	            $abc = 1;
	375	            last SWITCH;
	376	        };
	377
	378	        /^def/ && do {
	379	            $def = 1;
	380	            last SWITCH;
	381	        };
	382
	383	        /^xyz/ && do {
	384	            $xyz = 1;
	385	            last SWITCH;
	386	        };
	387	        $nothing = 1;
	388	    }
	389
	390	or
	391
	392	    SWITCH: {
	393	        /^abc/ and $abc = 1, last SWITCH;
	394	        /^def/ and $def = 1, last SWITCH;
	395	        /^xyz/ and $xyz = 1, last SWITCH;
	396	        $nothing = 1;
	397	    }
	398
	399	or even, horrors,
	400
	401	    if (/^abc/)
	402	        { $abc = 1 }
	403	    elsif (/^def/)
	404	        { $def = 1 }
	405	    elsif (/^xyz/)
	406	        { $xyz = 1 }
	407	    else
	408	        { $nothing = 1 }
	409
	410
	411	A common idiom for a switch statement is to use C<foreach>'s aliasing to make
	412	a temporary assignment to $_ for convenient matching:
	413
	414	    SWITCH: for ($where) {
	415	        /In Card Names/ && do { push @flags, '-e'; last; };
	416	        /Anywhere/      && do { push @flags, '-h'; last; };
	417	        /In Rulings/    && do {                    last; };
	418	        die "unknown value for form variable where: `$where'";
	419	    }
	420
	421	Another interesting approach to a switch statement is to arrange
	422	for a C<do> block to return the proper value:
	423
	424	    $amode = do {
	425	        if    ($flag & O_RDONLY) { "r" }        # XXX: isn't this 0?
	426	        elsif ($flag & O_WRONLY) { ($flag & O_APPEND) ? "a" : "w" }
	427	        elsif ($flag & O_RDWR)   {
	428	            if ($flag & O_CREAT) { "w+" }
	429	            else                 { ($flag & O_APPEND) ? "a+" : "r+" }
	430	        }
	431	    };
	432
	433	=head2 Goto
	434
	435	Although not for the faint of heart, Perl does support a C<goto> statement.
	436	A loop's LABEL is not actually a valid target for a C<goto>;
	437	it's just the name of the loop. There are three forms: goto-LABEL,
	438	goto-EXPR, and goto-&NAME.
	439
	440	The goto-LABEL form finds the statement labeled with LABEL and resumes
	441	execution there. It may not be used to go into any construct that
	442	requires initialization, such as a subroutine or a foreach loop. It
	443	also can't be used to go into a construct that is optimized away. It
	444	can be used to go almost anywhere else within the dynamic scope,
	445	including out of subroutines, but it's usually better to use some other
	446	construct such as C<last> or C<die>. The author of Perl has never felt the
	447	need to use this form of goto (in Perl, that is--C is another matter).
	448
	449	The goto-EXPR form expects a label name, whose scope will be resolved
	450	dynamically. This allows for computed gotos per FORTRAN, but isn't
	451	necessarily recommended if you're optimizing for maintainability:
	452
	453	    goto ("FOO", "BAR", "GLARCH")[$i];
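
Fleshed out into something that runs, a computed goto might look like the sketch below. The labels, C<$i>, and C<$reached> are placeholders invented for the example, not part of any real program.

```perl
use strict;
use warnings;

my $i       = 1;
my $reached = '';

goto ("FOO", "BAR", "GLARCH")[$i];      # the label name is chosen at run time

FOO:    $reached = "FOO";    goto DONE;
BAR:    $reached = "BAR";    goto DONE;
GLARCH: $reached = "GLARCH";
DONE:   print "$reached\n";             # prints "BAR"
```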
	454
	455	The goto-&NAME form is highly magical, and substitutes a call to the
	456	named subroutine for the currently running subroutine. This is used by
	457	AUTOLOAD() subroutines that wish to load another subroutine and then
	458	pretend that the other subroutine had been called in the first place
	459	(except that any modifications to @_ in the current subroutine are
	460	propagated to the other subroutine). After the C<goto>, not even caller()
	461	will be able to tell that this routine was called first.
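
Here is one way an AUTOLOAD() routine might use that form. It is a sketch only: the C<Greeter> package and its C<say_*> naming scheme are hypothetical, chosen just to show the hand-off.

```perl
use strict;
use warnings;

package Greeter;        # hypothetical package for this sketch

our $AUTOLOAD;

sub AUTOLOAD {
    my $name = $AUTOLOAD;               # e.g. "Greeter::say_hello"
    $name =~ s/.*:://;
    return if $name eq 'DESTROY';
    $name =~ /^say_(\w+)$/ or die "no such function: $name";
    my $word = $1;

    no strict 'refs';
    *{$AUTOLOAD} = sub { return "$word, $_[0]!" };   # install the real sub
    goto &{$AUTOLOAD};                  # hand the current @_ over to it
}

package main;

my $first  = Greeter::say_hello("world");   # goes through AUTOLOAD
my $second = Greeter::say_hello("again");   # calls the installed sub directly

print "$first $second\n";   # prints "hello, world! hello, again!"
```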
	462
	463	In almost all cases like this, it's a far, far better idea to use the
	464	structured control flow mechanisms of C<next>, C<last>, or C<redo> instead of
	465	resorting to a C<goto>. For certain applications, the catch and throw pair of
	466	C<eval{}> and die() for exception processing can also be a prudent approach.
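
The catch and throw pairing might be sketched as follows. The helper name C<safe_divide> is invented for the example.

```perl
use strict;
use warnings;

sub safe_divide {       # hypothetical helper
    my ($num, $den) = @_;
    my $quotient = eval {
        die "division by zero\n" if $den == 0;      # throw
        $num / $den;
    };
    return defined $quotient ? $quotient : "error: $@";     # catch
}

print safe_divide(6, 3), "\n";      # prints "2"
print safe_divide(1, 0);            # prints "error: division by zero"
```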
	467
	468	=head2 PODs: Embedded Documentation
	469
	470	Perl has a mechanism for intermixing documentation with source code.
	471	While it's expecting the beginning of a new statement, if the compiler
	472	encounters a line that begins with an equal sign and a word, like this
	473
	474	    =head1 Here There Be Pods!
	475
	476	then that text and all remaining text up through and including a line
	477	beginning with C<=cut> will be ignored. The format of the intervening
	478	text is described in L<perlpod>.
	479
	480	This allows you to intermix your source code
	481	and your documentation text freely, as in
	482
	483	    =item snazzle($)
	484
	485	    The snazzle() function will behave in the most spectacular
	486	    form that you can possibly imagine, not even excepting
	487	    cybernetic pyrotechnics.
	488
	489	    =cut back to the compiler, nuff of this pod stuff!
	490
	491	    sub snazzle($) {
	492	        my $thingie = shift;
	493	        .........
	494	    }
	495
	496	Note that pod translators should look at only paragraphs beginning
	497	with a pod directive (it makes parsing easier), whereas the compiler
	498	actually knows to look for pod escapes even in the middle of a
	499	paragraph. This means that the following secret stuff will be
	500	ignored by both the compiler and the translators.
	501
	502	    $a=3;
	503	    =secret stuff
	504	     warn "Neither POD nor CODE!?"
	505	    =cut back
	506	    print "got $a\n";
	507
	508	You probably shouldn't rely upon the warn() being podded out forever.
	509	Not all pod translators are well-behaved in this regard, and perhaps
	510	the compiler will become pickier.