=head1 NAME

perlstyle - Perl style guide

=head1 DESCRIPTION

Each programmer will, of course, have his or her own preferences with
regard to formatting, but there are some general guidelines that will
make your programs easier to read, understand, and maintain.

The most important thing is to run your programs under the B<-w>
flag at all times.  You may turn it off explicitly for particular
portions of code via the C<$^W> variable if you must.  You should
also always run under C<use strict> or know the reason why not.
The C<use sigtrap> and even C<use diagnostics> pragmas may also prove
useful.

Regarding aesthetics of code layout, about the only thing Larry
cares strongly about is that the closing curly brace of a multi-line
BLOCK should line up with the keyword that started the construct.
Beyond that, he has other preferences that aren't so strong:

=over 4

=item *

4-column indent.

=item *

Opening curly on same line as keyword, if possible, otherwise line up.

=item *

Space before the opening curly of a multiline BLOCK.

=item *

One-line BLOCK may be put on one line, including curlies.

=item *

No space before the semicolon.

=item *

Semicolon omitted in "short" one-line BLOCK.

=item *

Space around most operators.

=item *

Space around a "complex" subscript (inside brackets).

=item *

Blank lines between chunks that do different things.

=item *

Uncuddled elses.

=item *

No space between function name and its opening paren.

=item *

Space after each comma.

=item *

Long lines broken after an operator (except "and" and "or").

=item *

Space after last paren matching on current line.

=item *

Line up corresponding items vertically.

=item *

Omit redundant punctuation as long as clarity doesn't suffer.

=back

Larry has his reasons for each of these things, but he doesn't claim that
everyone else's mind works the same as his does.
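
As a rough illustration, the following sketch applies several of these
preferences at once.  The subroutine C<classify> and its data are made up
for the example and are not taken from any real program.

    use strict;
    use warnings;

    # Hypothetical helper: label a number by size.
    sub classify {
        my ($n) = @_;

        if ($n < 10) {
            return "small";
        }
        elsif ($n < 100) {                  # uncuddled else
            return "medium";
        }
        else {
            return "large";
        }
    }

    my @sizes = (3, 42, 1000);              # space after each comma
    my @small = grep { $_ < 10 } @sizes;    # short one-line BLOCK, no semicolon

    for my $n (@sizes) {
        print "$n is ", classify($n), "\n";
    }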

Here are some other more substantive style issues to think about:

=over 4

=item *

Just because you I<CAN> do something a particular way doesn't mean that
you I<SHOULD> do it that way.  Perl is designed to give you several
ways to do anything, so consider picking the most readable one.  For
instance

    open(FOO, $foo) || die "Can't open $foo: $!";

is better than

    die "Can't open $foo: $!" unless open(FOO, $foo);

because the second way hides the main point of the statement in a
modifier.  On the other hand

    print "Starting analysis\n" if $verbose;

is better than

    $verbose && print "Starting analysis\n";

since the main point isn't whether the user typed B<-v> or not.

Similarly, just because an operator lets you assume default arguments
doesn't mean that you have to make use of the defaults.  The defaults
are there for lazy systems programmers writing one-shot programs.  If
you want your program to be readable, consider supplying the argument.

Along the same lines, just because you I<CAN> omit parentheses in many
places doesn't mean that you ought to:

    return print reverse sort num values %array;
    return print(reverse(sort num (values(%array))));

When in doubt, parenthesize.  At the very least it will let some poor
schmuck bounce on the % key in B<vi>.

Even if you aren't in doubt, consider the mental welfare of the person
who has to maintain the code after you, and who will probably put
parens in the wrong place.

=item *

Don't go through silly contortions to exit a loop at the top or the
bottom, when Perl provides the C<last> operator so you can exit in
the middle.  Just "outdent" it a little to make it more visible:

    LINE:
        for (;;) {
            statements;
          last LINE if $foo;
            next LINE if /^#/;
            statements;
        }

=item *

Don't be afraid to use loop labels--they're there to enhance
readability as well as to allow multi-level loop breaks.  See the
previous example.
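
A minimal sketch of a multi-level break, using made-up data; the label
C<ROW> and the variable names are illustrative only:

    use strict;
    use warnings;

    my @grid = ([1, 2, 3], [4, -1, 6], [7, 8, 9]);
    my @seen;

    ROW:
    for my $row (@grid) {
        for my $cell (@$row) {
          next ROW if $cell < 0;    # skip the rest of this row
            push @seen, $cell;
        }
    }

    print "@seen\n";                # prints "1 2 3 4 7 8 9"

Without the label, C<next> would only move on to the following cell of
the same row.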

=item *

For portability, when using features that may not be implemented on
every machine, test the construct in an eval to see if it fails.  If
you know in which version or patchlevel a particular feature was
implemented, you can test C<$]> (C<$PERL_VERSION> in C<English>) to see if it
will be there.  The C<Config> module will also let you interrogate values
determined by the B<Configure> program when Perl was installed.
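
The three checks might look something like this.  The choice of
C<symlink> as the feature to probe and the 5.006 threshold are
assumptions made for the sake of the example:

    use strict;
    use warnings;
    use Config;

    # Probe for a feature by trying it inside eval.  On a platform
    # without symlink() the call dies and eval returns undef.
    my $has_symlink = eval { symlink("", ""); 1 } ? 1 : 0;

    # Compare against a numeric version; 5.006 is an arbitrary threshold.
    my $new_enough = $] >= 5.006 ? 1 : 0;

    # Ask Config what Configure recorded when Perl was built.
    my $osname = $Config{osname};

    print "symlink: $has_symlink, new enough: $new_enough, os: $osname\n";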

=item *

Choose mnemonic identifiers.  If you can't remember what mnemonic means,
you've got a problem.

=item * 

While short identifiers like $gotit are probably ok, use underscores to
separate words.  It is generally easier to read $var_names_like_this than
$VarNamesLikeThis, especially for non-native speakers of English. It's
also a simple rule that works consistently with VAR_NAMES_LIKE_THIS.

Package names are sometimes an exception to this rule.  Perl informally
reserves lowercase module names for "pragma" modules like C<integer> and
C<strict>.  Other modules should begin with a capital letter and use mixed
case, but probably without underscores due to limitations in primitive
filesystems' representations of module names as files that must fit into a
few sparse bytes.

=item *

You may find it helpful to use letter case to indicate the scope 
or nature of a variable. For example: 

    $ALL_CAPS_HERE   constants only (beware clashes with perl vars!)  
    $Some_Caps_Here  package-wide global/static 
    $no_caps_here    function scope my() or local() variables 

Function and method names seem to work best as all lowercase. 
E.g., $obj->as_string(). 

You can use a leading underscore to indicate that a variable or
function should not be used outside the package that defined it.
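
Put together, the conventions might read as follows.  The package
C<My::Counter> and everything in it are invented for this sketch:

    package My::Counter;              # hypothetical package name
    use strict;
    use warnings;

    our $MAX_COUNT = 3;               # all caps: treated as a constant
    my  $Total     = 0;               # some caps: file-wide static state

    sub _clamp {                      # leading underscore: internal helper
        my ($value) = @_;
        return $value > $MAX_COUNT ? $MAX_COUNT : $value;
    }

    sub add_to_total {                # lowercase function name
        my ($amount) = @_;            # no caps: function-scope lexical
        $Total = _clamp($Total + $amount);
        return $Total;
    }

    print add_to_total(2), "\n";      # prints 2
    print add_to_total(5), "\n";      # prints 3, clamped to $MAX_COUNT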

=item *

If you have a really hairy regular expression, use the C</x> modifier and
put in some whitespace to make it look a little less like line noise.
Don't use slash as a delimiter when your regexp has slashes or backslashes.
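
For instance, a pattern that has to match a path can use braces as
delimiters and spread itself out under C</x>.  The input line below is
made up for the example:

    use strict;
    use warnings;

    my $line = "2024-01-15 /var/log/app.log";

    my ($year, $month, $day, $path) = $line =~ m{
        ^ (\d{4}) - (\d{2}) - (\d{2})    # date as YYYY-MM-DD
        \s+                              # separator
        ( / \S+ )                        # absolute path
    }x;

    print "$year $month $day $path\n";   # prints "2024 01 15 /var/log/app.log"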

=item *

Use the new "and" and "or" operators to avoid having to parenthesize
list operators so much, and to reduce the incidence of punctuational
operators like C<&&> and C<||>.  Call your subroutines as if they were
functions or list operators to avoid excessive ampersands and parens.
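
A small sketch of the difference, assuming a hypothetical subroutine
C<log_line> that is defined before it is called:

    use strict;
    use warnings;

    # Hypothetical helper, defined before use so that it can be
    # called like a built-in list operator.
    sub log_line {
        print "log: @_\n";
        return scalar @_;
    }

    # Older style: ampersand, parens, and a tightly binding operator.
    &log_line("starting", "up") || die "log_line failed";

    # Same call as a list operator.  "or" binds loosely enough that
    # the arguments need no parentheses.
    log_line "starting", "up" or die "log_line failed";

Had the second call used C<||> instead of C<or>, the C<die> would have
been attached to the last argument rather than to the call as a whole.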

=item *

Use here documents instead of repeated print() statements.
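
A minimal sketch, with a made-up program name.  It uses the indented
form C<<< <<~ >>>, which assumes Perl 5.26 or later; on an older perl the
plain C<<< << >>> form works the same way but needs the text and the
terminator to start in the first column.

    use v5.26;
    use warnings;

    my $program = "frobnicate";

    my $usage = <<~"END_USAGE";
        Usage: $program FILE
        Options:
            -v    print progress messages
        END_USAGE

    print $usage;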

=item *

Line up corresponding things vertically, especially if it'd be too long
to fit on one line anyway.  

    $IDX = $ST_MTIME;
    $IDX = $ST_ATIME       if $opt_u;
    $IDX = $ST_CTIME       if $opt_c;
    $IDX = $ST_SIZE        if $opt_s;

    mkdir $tmpdir, 0700 or die "can't mkdir $tmpdir: $!";
    chdir($tmpdir)      or die "can't chdir $tmpdir: $!";
    mkdir 'tmp',   0777 or die "can't mkdir $tmpdir/tmp: $!";

=item *

Always check the return codes of system calls.  Good error messages should
go to STDERR, include which program caused the problem, what the failed
system call and arguments were, and (VERY IMPORTANT) should contain the
standard system error message for what went wrong.  Here's a simple but
sufficient example:

    opendir(D, $dir)     or die "can't opendir $dir: $!";
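
A slightly fuller message could also name the program via C<$0>.  This
is only a sketch: the path is made up and expected not to exist, and
C<warn> is used instead of C<die> so that the example runs to completion.

    use strict;
    use warnings;

    my $dir = "/no/such/directory";

    # Name the program ($0), the failed call and its argument, and the
    # system's own explanation ($!).  Both warn and die write to STDERR.
    opendir(my $dh, $dir)
        or warn "$0: can't opendir $dir: $!\n";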

=item *

Line up your translations when it makes sense:

    tr [abc]
       [xyz];

=item *

Think about reusability.  Why waste brainpower on a one-shot when you
might want to do something like it again?  Consider generalizing your
code.  Consider writing a module or object class.  Consider making your
code run cleanly with C<use strict> and B<-w> in effect.  Consider giving away
your code.  Consider changing your whole world view.  Consider... oh,
never mind.

=item *

Be consistent.

=item *

Be nice.

=back