Commit | Line | Data |
---|---|---|
a0d0e21e LW |
1 | =head1 NAME |
2 | ||
3 | perl - Practical Extraction and Report Language | |
4 | ||
5 | =head1 SYNOPSIS | |
6 | ||
7 | For ease of access, the Perl manual has been split up into a number | |
8 | of sections: | |
9 | ||
10 |     perl        Perl overview (this section) | |
11 |     perldata    Perl data structures | |
12 |     perlsyn     Perl syntax | |
13 |     perlop      Perl operators and precedence | |
14 |     perlre      Perl regular expressions | |
15 |     perlrun     Perl execution and options | |
16 |     perlfunc    Perl builtin functions | |
17 |     perlvar     Perl predefined variables | |
18 |     perlsub     Perl subroutines | |
19 |     perlmod     Perl modules | |
20 |     perlref     Perl references and nested data structures | |
21 |     perlobj     Perl objects | |
22 |     perlbot     Perl OO tricks and examples | |
23 |     perldebug   Perl debugging | |
24 |     perldiag    Perl diagnostic messages | |
25 |     perlform    Perl formats | |
26 |     perlipc     Perl interprocess communication | |
27 |     perlsec     Perl security | |
28 |     perltrap    Perl traps for the unwary | |
29 |     perlstyle   Perl style guide | |
30 |     perlapi     Perl application programming interface | |
31 |     perlguts    Perl internal functions for those doing extensions | |
32 |     perlcall    Perl calling conventions from C | |
33 |     perlovl     Perl overloading semantics | |
34 |     perlbook    Perl book information | |
35 | ||
36 | (If you're intending to read these straight through for the first time, | |
37 | the suggested order will tend to reduce the number of forward references.) | |
38 | ||
39 | If something strange has gone wrong with your program and you're not | |
40 | sure where you should look for help, try the B<-w> switch first. It | |
41 | will often point out exactly where the trouble is. | |
42 | ||
43 | =head1 DESCRIPTION | |
44 | ||
45 | Perl is an interpreted language optimized for scanning arbitrary | |
46 | text files, extracting information from those text files, and printing | |
47 | reports based on that information. It's also a good language for many | |
48 | system management tasks. The language is intended to be practical | |
49 | (easy to use, efficient, complete) rather than beautiful (tiny, | |
50 | elegant, minimal). It combines (in the author's opinion, anyway) some | |
51 | of the best features of C, B<sed>, B<awk>, and B<sh>, so people | |
52 | familiar with those languages should have little difficulty with it. | |
53 | (Language historians will also note some vestiges of B<csh>, Pascal, | |
54 | and even BASIC-PLUS.) Expression syntax corresponds quite closely to C | |
55 | expression syntax. Unlike most Unix utilities, Perl does not | |
56 | arbitrarily limit the size of your data--if you've got the memory, | |
57 | Perl can slurp in your whole file as a single string. Recursion is | |
58 | of unlimited depth. And the hash tables used by associative arrays | |
59 | grow as necessary to prevent degraded performance. Perl uses | |
60 | sophisticated pattern matching techniques to scan large amounts of data | |
61 | very quickly. Although optimized for scanning text, Perl can also | |
62 | deal with binary data, and can make dbm files look like associative | |
63 | arrays (where dbm is available). Setuid Perl scripts are safer than | |
64 | C programs through a dataflow tracing mechanism which prevents many | |
65 | stupid security holes. If you have a problem that would ordinarily use | |
66 | B<sed> or B<awk> or B<sh>, but it exceeds their capabilities or must | |
67 | run a little faster, and you don't want to write the silly thing in C, | |
68 | then Perl may be for you. There are also translators to turn your | |
69 | B<sed> and B<awk> scripts into Perl scripts. | |
70 | ||
71 | But wait, there's more... | |
72 | ||
73 | Perl version 5 is nearly a complete rewrite, and provides | |
74 | the following additional benefits: | |
75 | ||
76 | =over 5 | |
77 | ||
78 | =item * Many usability enhancements | |
79 | ||
80 | It is now possible to write much more readable Perl code (even within | |
81 | regular expressions). Formerly cryptic variable names can be replaced | |
82 | by mnemonic identifiers. Error messages are more informative, and the | |
83 | optional warnings will catch many of the mistakes a novice might make. | |
84 | This cannot be stressed enough. Whenever you get mysterious behavior, | |
85 | try the B<-w> switch!!! Whenever you don't get mysterious behavior, | |
86 | try using B<-w> anyway. | |
87 | ||
88 | =item * Simplified grammar | |
89 | ||
90 | The new yacc grammar is one half the size of the old one. Many of the | |
91 | arbitrary grammar rules have been regularized. The number of reserved | |
92 | words has been cut by 2/3. Despite this, nearly all old Perl scripts | |
93 | will continue to work unchanged. | |
94 | ||
95 | =item * Lexical scoping | |
96 | ||
97 | Perl variables may now be declared within a lexical scope, like "auto" | |
98 | variables in C. Not only is this more efficient, but it contributes | |
99 | to better privacy for "programming in the large". | |
100 | ||
101 | =item * Arbitrarily nested data structures | |
102 | ||
103 | Any scalar value, including any array element, may now contain a | |
104 | reference to any other variable or subroutine. You can easily create | |
105 | anonymous variables and subroutines. Perl manages your reference | |
106 | counts for you. | |
107 | ||
108 | =item * Modularity and reusability | |
109 | ||
110 | The Perl library is now defined in terms of modules which can be easily | |
111 | shared among various packages. A package may choose to import all or a | |
112 | portion of a module's published interface. Pragmas (that is, compiler | |
113 | directives) are defined and used by the same mechanism. | |
114 | ||
115 | =item * Object-oriented programming | |
116 | ||
117 | A package can function as a class. Dynamic multiple inheritance and | |
118 | virtual methods are supported in a straightforward manner and with very | |
119 | little new syntax. Filehandles may now be treated as objects. | |
120 | ||
121 | =item * Embeddable and Extensible | |
122 | ||
123 | Perl may now be embedded easily in your C or C++ application, and can | |
124 | either call or be called by your routines through a documented | |
125 | interface. The XS preprocessor is provided to make it easy to glue | |
126 | your C or C++ routines into Perl. Dynamic loading of modules is | |
127 | supported. | |
128 | ||
129 | =item * POSIX compliant | |
130 | ||
131 | A major new module is the POSIX module, which provides access to all | |
132 | available POSIX routines and definitions, via object classes where | |
133 | appropriate. | |
134 | ||
135 | =item * Package constructors and destructors | |
136 | ||
137 | The new BEGIN and END blocks provide means to capture control as | |
138 | a package is being compiled, and after the program exits. As a | |
139 | degenerate case they work just like awk's BEGIN and END when you | |
140 | use the B<-p> or B<-n> switches. | |
141 | ||
142 | =item * Multiple simultaneous DBM implementations | |
143 | ||
144 | A Perl program may now access DBM, NDBM, SDBM, GDBM, and Berkeley DB | |
145 | files from the same script simultaneously. In fact, the old dbmopen | |
146 | interface has been generalized to allow any variable to be tied | |
147 | to an object class which defines its access methods. | |
148 | ||
149 | =item * Subroutine definitions may now be autoloaded | |
150 | ||
151 | In fact, the AUTOLOAD mechanism also allows you to define any arbitrary | |
152 | semantics for undefined subroutine calls. It's not just for autoloading. | |
153 | ||
154 | =item * Regular expression enhancements | |
155 | ||
156 | You can now specify non-greedy quantifiers. You can now do grouping | |
157 | without creating a backreference. You can now write regular expressions | |
158 | with embedded whitespace and comments for readability. A consistent | |
159 | extensibility mechanism has been added that is upwardly compatible with | |
160 | all old regular expressions. | |
161 | ||
162 | =back | |
163 | ||
164 | Ok, that's I<definitely> enough hype. | |
165 | ||
166 | =head1 ENVIRONMENT | |
167 | ||
168 | =over 12 | |
169 | ||
170 | =item HOME | |
171 | ||
172 | Used if chdir has no argument. | |
173 | ||
174 | =item LOGDIR | |
175 | ||
176 | Used if chdir has no argument and HOME is not set. | |
177 | ||
178 | =item PATH | |
179 | ||
180 | Used in executing subprocesses, and in finding the script if B<-S> is | |
181 | used. | |
182 | ||
183 | =item PERL5LIB | |
184 | ||
185 | A colon-separated list of directories in which to look for Perl library | |
186 | files before looking in the standard library and the current | |
187 | directory. If PERL5LIB is not defined, PERLLIB is used. | |
188 | ||
189 | =item PERL5DB | |
190 | ||
191 | The command used to load the debugger code. If unset, Perl uses: | |
192 | ||
193 |     BEGIN { require 'perl5db.pl' } | |
194 | ||
195 | =item PERLLIB | |
196 | ||
197 | A colon-separated list of directories in which to look for Perl library | |
198 | files before looking in the standard library and the current | |
199 | directory. If PERL5LIB is defined, PERLLIB is not used. | |
200 | ||
201 | ||
202 | =back | |
203 | ||
204 | Apart from these, Perl uses no other environment variables, except | |
205 | to make them available to the script being executed, and to child | |
206 | processes. However, scripts running setuid would do well to execute | |
207 | the following lines before doing anything else, just to keep people | |
208 | honest: | |
209 | ||
210 |     $ENV{'PATH'} = '/bin:/usr/bin';    # or whatever you need | |
211 |     $ENV{'SHELL'} = '/bin/sh' if defined $ENV{'SHELL'}; | |
212 |     $ENV{'IFS'} = '' if defined $ENV{'IFS'}; | |
213 | ||
214 | =head1 AUTHOR | |
215 | ||
216 | Larry Wall <F<lwall@netlabs.com>>, with the help of oodles of other folks. | |
217 | ||
218 | =head1 FILES | |
219 | ||
220 |     "/tmp/perl-e$$"    temporary file for -e commands | |
221 |     "@INC"             locations of Perl 5 libraries | |
222 | ||
223 | =head1 SEE ALSO | |
224 | ||
225 |     a2p    awk to Perl translator | |
226 |     s2p    sed to Perl translator | |
227 | ||
228 | =head1 DIAGNOSTICS | |
229 | ||
230 | The B<-w> switch produces some lovely diagnostics. | |
231 | ||
232 | See L<perldiag> for explanations of all Perl's diagnostics. | |
233 | ||
234 | Compilation errors will tell you the line number of the error, with an | |
235 | indication of the next token or token type that was to be examined. | |
236 | (In the case of a script passed to Perl via B<-e> switches, each | |
237 | B<-e> is counted as one line.) | |
238 | ||
239 | Setuid scripts have additional constraints that can produce error | |
240 | messages such as "Insecure dependency". See L<perlsec>. | |
241 | ||
242 | Did we mention that you should definitely consider using the B<-w> | |
243 | switch? | |
244 | ||
245 | =head1 BUGS | |
246 | ||
247 | The B<-w> switch is not mandatory. | |
248 | ||
249 | Perl is at the mercy of your machine's definitions of various | |
250 | operations such as type casting, atof() and sprintf(). | |
251 | ||
252 | If your stdio requires a seek or eof between reads and writes on a | |
253 | particular stream, so does Perl. (This doesn't apply to sysread() | |
254 | and syswrite().) | |
255 | ||
256 | While none of the built-in data types have any arbitrary size limits | |
257 | (apart from memory size), there are still a few arbitrary limits: a | |
258 | given identifier may not be longer than 255 characters, and no | |
259 | component of your PATH may be longer than 255 characters if you use B<-S>. A regular | |
260 | expression may not compile to more than 32767 bytes internally. | |
261 | ||
262 | Perl actually stands for Pathologically Eclectic Rubbish Lister, but | |
263 | don't tell anyone I said that. | |
264 | ||
265 | =head1 NOTES | |
266 | ||
267 | The Perl motto is "There's more than one way to do it." Divining | |
268 | how many more is left as an exercise to the reader. | |
269 | ||
270 | The three principal virtues of a programmer are Laziness, | |
271 | Impatience, and Hubris. See the Camel Book for why. |