Commit | Line | Data |
---|---|---|
a0d0e21e LW |
1 | =head1 NAME |
2 | ||
3 | perlstyle - Perl style guide | |
4 | ||
5 | =head1 DESCRIPTION | |
6 | ||
a0d0e21e LW |
7 | Each programmer will, of course, have his or her own preferences |
8 | with regard to formatting, but there are some general guidelines that will | |
9 | make your programs easier to read, understand, and maintain. | |
10 | ||
cb1a09d0 AD |
11 | The most important thing is to run your programs under the B<-w> |
12 | flag at all times. You may turn it off explicitly for particular | |
13 | portions of code via the C<$^W> variable if you must. You should | |
14 | also always run under C<use strict> or know the reason why not. | |
15 | The C<use sigtrap> and even C<use diagnostics> pragmas may also prove | |
16 | useful. | |
17 | ||
a0d0e21e LW |
18 | Regarding aesthetics of code layout, about the only thing Larry |
19 | cares strongly about is that the closing curly brace of | |
20 | a multi-line BLOCK should line up with the keyword that started the construct. | |
21 | Beyond that, he has other preferences that aren't so strong: | |
22 | ||
23 | =over 4 | |
24 | ||
25 | =item * | |
26 | ||
27 | 4-column indent. | |
28 | ||
29 | =item * | |
30 | ||
31 | Opening curly on same line as keyword, if possible, otherwise line up. | |
32 | ||
33 | =item * | |
34 | ||
35 | Space before the opening curly of a multiline BLOCK. | |
36 | ||
37 | =item * | |
38 | ||
39 | One-line BLOCK may be put on one line, including curlies. | |
40 | ||
41 | =item * | |
42 | ||
43 | No space before the semicolon. | |
44 | ||
45 | =item * | |
46 | ||
47 | Semicolon omitted in "short" one-line BLOCK. | |
48 | ||
49 | =item * | |
50 | ||
51 | Space around most operators. | |
52 | ||
53 | =item * | |
54 | ||
55 | Space around a "complex" subscript (inside brackets). | |
56 | ||
57 | =item * | |
58 | ||
59 | Blank lines between chunks that do different things. | |
60 | ||
61 | =item * | |
62 | ||
63 | Uncuddled elses. | |
64 | ||
65 | =item * | |
66 | ||
67 | No space between function name and its opening paren. | |
68 | ||
69 | =item * | |
70 | ||
71 | Space after each comma. | |
72 | ||
73 | =item * | |
74 | ||
75 | Long lines broken after an operator (except "and" and "or"). | |
76 | ||
77 | =item * | |
78 | ||
79 | Space after last paren matching on current line. | |
80 | ||
81 | =item * | |
82 | ||
83 | Line up corresponding items vertically. | |
84 | ||
85 | =item * | |
86 | ||
87 | Omit redundant punctuation as long as clarity doesn't suffer. | |
88 | ||
89 | =back | |
90 | ||
91 | Larry has his reasons for each of these things, but he doesn't claim that | |
92 | everyone else's mind works the same as his does. | |
93 | ||
94 | Here are some other more substantive style issues to think about: | |
95 | ||
96 | =over 4 | |
97 | ||
98 | =item * | |
99 | ||
100 | Just because you I<CAN> do something a particular way doesn't mean that | |
101 | you I<SHOULD> do it that way. Perl is designed to give you several | |
102 | ways to do anything, so consider picking the most readable one. For | |
103 | instance | |
104 | ||
105 | open(FOO,$foo) || die "Can't open $foo: $!"; | |
106 | ||
107 | is better than | |
108 | ||
109 | die "Can't open $foo: $!" unless open(FOO,$foo); | |
110 | ||
111 | because the second way hides the main point of the statement in a | |
112 | modifier. On the other hand | |
113 | ||
114 | print "Starting analysis\n" if $verbose; | |
115 | ||
116 | is better than | |
117 | ||
118 | $verbose && print "Starting analysis\n"; | |
119 | ||
120 | since the main point isn't whether the user typed B<-v> or not. | |
121 | ||
122 | Similarly, just because an operator lets you assume default arguments | |
123 | doesn't mean that you have to make use of the defaults. The defaults | |
124 | are there for lazy systems programmers writing one-shot programs. If | |
125 | you want your program to be readable, consider supplying the argument. | |
126 | ||
127 | Along the same lines, just because you I<CAN> omit parentheses in many | |
128 | places doesn't mean that you ought to: | |
129 | ||
130 | return print reverse sort num values %array; | |
131 | return print(reverse(sort num (values(%array)))); | |
132 | ||
133 | When in doubt, parenthesize. At the very least it will let some poor | |
134 | schmuck bounce on the % key in B<vi>. | |
135 | ||
136 | Even if you aren't in doubt, consider the mental welfare of the person | |
137 | who has to maintain the code after you, and who will probably put | |
138 | parens in the wrong place. | |
139 | ||
140 | =item * | |
141 | ||
142 | Don't go through silly contortions to exit a loop at the top or the | |
143 | bottom, when Perl provides the C<last> operator so you can exit in | |
144 | the middle. Just "outdent" it a little to make it more visible: | |
145 | ||
146 | LINE: | |
147 |     for (;;) { | |
148 |         statements; | |
149 |       last LINE if $foo; | |
150 |         next LINE if /^#/; | |
151 |         statements; | |
152 |     } | |
153 | ||
154 | =item * | |
155 | ||
156 | Don't be afraid to use loop labels--they're there to enhance | |
157 | readability as well as to allow multi-level loop breaks. See the | |
158 | previous example. | |
159 | ||
160 | =item * | |
161 | ||
162 | For portability, when using features that may not be implemented on | |
163 | every machine, test the construct in an eval to see if it fails. If | |
164 | you know in which version or patchlevel a particular feature was | |
165 | implemented, you can test C<$]> (C<$PERL_VERSION> in C<English>) to see if it | |
166 | will be there. The C<Config> module will also let you interrogate values | |
167 | determined by the B<Configure> program when Perl was installed. | |
168 | ||
169 | =item * | |
170 | ||
171 | Choose mnemonic identifiers. If you can't remember what mnemonic means, | |
172 | you've got a problem. | |
173 | ||
cb1a09d0 AD |
174 | =item * |
175 | ||
176 | While short identifiers like $gotit are probably ok, use underscores to | |
177 | separate words. It is generally easier to read $var_names_like_this than | |
178 | $VarNamesLikeThis, especially for non-native speakers of English. It's | |
179 | also a simple rule that works consistently with VAR_NAMES_LIKE_THIS. | |
180 | ||
181 | Package names are sometimes an exception to this rule. Perl informally | |
182 | reserves lowercase module names for "pragma" modules like C<integer> and | |
183 | C<strict>. Other modules should begin with a capital letter and use mixed | |
184 | case, but probably without underscores due to limitations in primitive | |
185 | filesystems' representations of module names as files that must fit into a | |
186 | few sparse bytes. | |
187 | ||
188 | =item * | |
189 | ||
190 | You may find it helpful to use letter case to indicate the scope | |
191 | or nature of a variable. For example: | |
192 | ||
193 | $ALL_CAPS_HERE   constants only (beware clashes with perl vars!) | |
194 | $Some_Caps_Here  package-wide global/static | |
195 | $no_caps_here    function scope my() or local() variables | |
196 | ||
197 | Function and method names seem to work best as all lowercase. | |
198 | E.g., $obj->as_string(). | |
199 | ||
200 | You can use a leading underscore to indicate that a variable or | |
201 | function should not be used outside the package that defined it. | |
202 | ||
a0d0e21e LW |
203 | =item * |
204 | ||
205 | If you have a really hairy regular expression, use the C</x> modifier and | |
206 | put in some whitespace to make it look a little less like line noise. | |
207 | Don't use slash as a delimiter when your regexp has slashes or backslashes. | |
208 | ||
209 | =item * | |
210 | ||
211 | Use the new "and" and "or" operators to avoid having to parenthesize | |
212 | list operators so much, and to reduce the incidence of punctuational | |
213 | operators like C<&&> and C<||>. Call your subroutines as if they were | |
214 | functions or list operators to avoid excessive ampersands and parens. | |
215 | ||
216 | =item * | |
217 | ||
218 | Use here documents instead of repeated print() statements. | |
219 | ||
220 | =item * | |
221 | ||
222 | Line up corresponding things vertically, especially if it'd be too long | |
223 | to fit on one line anyway. | |
224 | ||
225 | $IDX = $ST_MTIME; | |
226 | $IDX = $ST_ATIME       if $opt_u; | |
227 | $IDX = $ST_CTIME       if $opt_c; | |
228 | $IDX = $ST_SIZE        if $opt_s; | |
229 | ||
230 | mkdir $tmpdir, 0700 or die "can't mkdir $tmpdir: $!"; | |
231 | chdir($tmpdir)      or die "can't chdir $tmpdir: $!"; | |
232 | mkdir 'tmp',   0777 or die "can't mkdir $tmpdir/tmp: $!"; | |
233 | ||
234 | =item * | |
235 | ||
cb1a09d0 AD |
236 | Always check the return codes of system calls. Good error messages should |
237 | go to STDERR, include which program caused the problem, what the failed | |
238 | system call and arguments were, and (VERY IMPORTANT) should contain the | |
239 | standard system error message for what went wrong. Here's a simple but | |
240 | sufficient example: | |
241 | ||
242 | opendir(D, $dir) or die "can't opendir $dir: $!"; | |
243 | ||
244 | =item * | |
245 | ||
a0d0e21e LW |
246 | Line up your translations when it makes sense: |
247 | ||
248 | tr [abc] | |
249 |    [xyz]; | |
250 | ||
251 | =item * | |
252 | ||
253 | Think about reusability. Why waste brainpower on a one-shot when you | |
254 | might want to do something like it again? Consider generalizing your | |
255 | code. Consider writing a module or object class. Consider making your | |
256 | code run cleanly with C<use strict> and B<-w> in effect. Consider giving away | |
257 | your code. Consider changing your whole world view. Consider... oh, | |
258 | never mind. | |
259 | ||
260 | =item * | |
261 | ||
262 | Be consistent. | |
263 | ||
264 | =item * | |
265 | ||
266 | Be nice. | |
267 | ||
268 | =back |
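
Several of the guidelines above can be seen working together in one small sketch. It is an illustration only, not code from this guide: the helper names read_config_lines, parse_pair and usage_text are hypothetical, and the sketch assumes a Perl recent enough to provide the C<warnings> pragma, lexical filehandles and three-argument open() instead of the bareword handles used in the examples above.

```perl
use strict;
use warnings;    # lexically scoped counterpart of running under -w

# Hypothetical helper: return the non-comment lines of a file,
# stopping early at a line that reads exactly "__END__".
sub read_config_lines {
    my ($path) = @_;

    # Check the system call; name the program, the call, its argument and $!.
    open(my $fh, '<', $path) or die "$0: can't open $path: $!";

    my @lines;
  LINE:
    while (my $line = <$fh>) {
        chomp $line;
      last LINE if $line eq '__END__';    # outdented so the exit is visible
        next LINE if $line =~ /^#/;
        push @lines, $line;
    }

    close($fh) or die "$0: can't close $path: $!";
    return @lines;
}

# Hypothetical helper: split "key = value", using /x and a non-slash
# delimiter so the pattern can carry whitespace and comments.
sub parse_pair {
    my ($line) = @_;
    return unless $line =~ m{
        ^ \s* (\w+)      # key
        \s* = \s*        # separator
        (.*?) \s* $      # value, trailing whitespace dropped
    }x;
    return ($1, $2);
}

# Hypothetical helper: a here document instead of repeated print() calls.
sub usage_text {
    my ($program) = @_;
    return <<"END_USAGE";
Usage: $program [-v] FILE
  -v    print progress messages
END_USAGE
}
```

A caller might write C<my ($key, $value) = parse_pair($_) for read_config_lines($file);> and print usage_text($0) to STDERR when no file is given. The lexical filehandle is a choice made for this sketch; the bareword style shown earlier follows the same error-checking pattern.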