Commit | Line | Data |
---|---|---|
bfe16a1a JH |
1 | =head1 NAME |
2 | ||
3 | perlintro -- a brief introduction and overview of Perl | |
4 | ||
5 | =head1 DESCRIPTION | |
6 | ||
7 | This document is intended to give you a quick overview of the Perl | |
8 | programming language, along with pointers to further documentation. It | |
9 | is intended as a "bootstrap" guide for those who are new to the | |
10 | language, and provides just enough information for you to be able to | |
11 | read other people's Perl and understand roughly what it's doing, or | |
12 | write your own simple scripts. | |
13 | ||
14 | This introductory document does not aim to be complete. It does not | |
15 | even aim to be entirely accurate. In some cases perfection has been | |
16 | sacrificed for the sake of getting the general idea across. You are | |
98fcdafd | 17 | I<strongly> advised to follow this introduction with more information |
bfe16a1a JH |
18 | from the full Perl manual, the table of contents to which can be found |
19 | in L<perltoc>. | |
20 | ||
41489bc0 | 21 | Throughout this document you'll see references to other parts of the |
bfe16a1a | 22 | Perl documentation. You can read that documentation using the C<perldoc> |
98fcdafd | 23 | command or whatever method you're using to read this document. |
bfe16a1a | 24 | |
883fe895 JV |
25 | Throughout Perl's documentation, you'll find numerous examples intended |
26 | to help explain the discussed features. Please keep in mind that many | |
27 | of them are code fragments rather than complete programs. | |
28 | ||
29 | These examples often reflect the style and preference of the author of | |
30 | that piece of the documentation, and may be briefer than a corresponding | |
31 | line of code in a real program. Except where otherwise noted, you | |
32 | should assume that C<use strict> and C<use warnings> statements | |
33 | appear earlier in the "program", and that any variables used have | |
34 | already been declared, even if those declarations have been omitted | |
35 | to make the example easier to read. | |
36 | ||
37 | Do note that the examples have been written by many different authors over | |
e8b5ae53 | 38 | a period of several decades. Styles and techniques will therefore differ, |
883fe895 | 39 | although some effort has been made to not vary styles too widely in the |
ae4e3076 RS |
40 | same sections. Do not consider one style to be better than others - "There's |
41 | More Than One Way To Do It" is one of Perl's mottos. After all, in your | |
883fe895 | 42 | journey as a programmer, you are likely to encounter different styles. |
4f7b1295 | 43 | |
bfe16a1a JH |
44 | =head2 What is Perl? |
45 | ||
41489bc0 GS |
46 | Perl is a general-purpose programming language originally developed for |
47 | text manipulation and now used for a wide range of tasks including | |
48 | system administration, web development, network programming, GUI | |
bfe16a1a JH |
49 | development, and more. |
50 | ||
98fcdafd AT |
51 | The language is intended to be practical (easy to use, efficient, |
52 | complete) rather than beautiful (tiny, elegant, minimal). Its major | |
53 | features are that it's easy to use, supports both procedural and | |
54 | object-oriented (OO) programming, has powerful built-in support for text | |
55 | processing, and has one of the world's most impressive collections of | |
56 | third-party modules. | |
bfe16a1a | 57 | |
41489bc0 GS |
58 | Different definitions of Perl are given in L<perl>, L<perlfaq1> and |
59 | no doubt other places. From this we can determine that Perl is different | |
bfe16a1a JH |
60 | things to different people, but that lots of people think it's at least |
61 | worth writing about. | |
62 | ||
63 | =head2 Running Perl programs | |
64 | ||
65 | To run a Perl program from the Unix command line: | |
66 | ||
0b00a6f7 | 67 | perl progname.pl |
bfe16a1a JH |
68 | |
69 | Alternatively, put this as the first line of your script: | |
70 | ||
0b00a6f7 | 71 | #!/usr/bin/env perl |
bfe16a1a | 72 | |
f703fc96 | 73 | ... and run the script as F</path/to/script.pl>. Of course, it'll need |
bfe16a1a JH |
74 | to be executable first, so C<chmod 755 script.pl> (under Unix). |
75 | ||
e8b5ae53 | 76 | (This start line assumes you have the B<env> program. You can also put |
51370f99 RGS |
77 | the path to your perl executable there directly, as in C<#!/usr/bin/perl>.) |
78 | ||
bfe16a1a | 79 | For more information, including instructions for other platforms such as |
8939ba94 | 80 | Windows and Mac OS, read L<perlrun>. |
bfe16a1a | 81 | |
41489bc0 GS |
82 | =head2 Safety net |
83 | ||
e8b5ae53 | 84 | Perl by default is very forgiving. To make your programs more robust, |
d699aa5b | 85 | it is recommended that you start each of them with the following lines: |
41489bc0 | 86 | |
0b00a6f7 KW |
87 | #!/usr/bin/perl |
88 | use strict; | |
89 | use warnings; | |
41489bc0 | 90 | |
64446524 | 91 | The two additional lines ask perl to catch various common |
e8b5ae53 | 92 | problems in your code. They check different things so you need both. A |
64446524 RGS |
93 | potential problem caught by C<use strict;> will cause your code to stop |
94 | immediately when it is encountered, while C<use warnings;> will merely | |
95 | give a warning (like the command-line switch B<-w>) and let your code run. | |
96 | To read more about them check their respective manual pages at L<strict> | |
97 | and L<warnings>. | |
41489bc0 | 98 | |
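As a rough illustration of the difference, the following sketch triggers one problem of each kind. The variable C<$typo> is a hypothetical undeclared name. The sketch also uses C<$SIG{__WARN__}> and a string C<eval>, neither of which is covered in this introduction, purely so that the warning and the error can be captured and reported instead of ending the example early.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect warnings in an array instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# "use warnings" complains about the undefined value, but the code keeps running.
my $undefined;
my $total = $undefined + 1;
my $warning_count = @warnings;

# "use strict" refuses to compile code that uses an undeclared variable.
# The string eval confines that failure so this sketch can report it.
my $compiled = eval q{ $typo = 1; 1 };
my $strict_error = $@;

print "Total is $total after $warning_count warning(s)\n";
print "strict said: $strict_error";
```

In an ordinary program you would not wrap anything in C<eval>; the strict error would simply stop the program before it ran.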
bfe16a1a JH |
99 | =head2 Basic syntax overview |
100 | ||
101 | A Perl script or program consists of one or more statements. These | |
102 | statements are simply written in the script in a straightforward | |
98fcdafd AT |
103 | fashion. There is no need to have a C<main()> function or anything of |
104 | that kind. | |
bfe16a1a JH |
105 | |
106 | Perl statements end in a semi-colon: | |
107 | ||
0b00a6f7 | 108 | print "Hello, world"; |
bfe16a1a JH |
109 | |
110 | Comments start with a hash symbol and run to the end of the line: | |
111 | ||
0b00a6f7 | 112 | # This is a comment |
bfe16a1a JH |
113 | |
114 | Whitespace is irrelevant: | |
115 | ||
0b00a6f7 KW |
116 | print |
117 | "Hello, world" | |
118 | ; | |
bfe16a1a JH |
119 | |
120 | ... except inside quoted strings: | |
121 | ||
0b00a6f7 KW |
122 | # this would print with a linebreak in the middle |
123 | print "Hello | |
124 | world"; | |
bfe16a1a JH |
125 | |
126 | Double quotes or single quotes may be used around literal strings: | |
127 | ||
0b00a6f7 KW |
128 | print "Hello, world"; |
129 | print 'Hello, world'; | |
bfe16a1a JH |
130 | |
131 | However, only double quotes "interpolate" variables and special | |
132 | characters such as newlines (C<\n>): | |
133 | ||
0b00a6f7 KW |
134 | print "Hello, $name\n"; # works fine |
135 | print 'Hello, $name\n'; # prints $name\n literally | |
bfe16a1a JH |
136 | |
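To see the two quoting styles side by side, here is a small sketch. The value assigned to C<$name> is just an assumption made for the example.

```perl
use strict;
use warnings;

my $name = "Larry";                  # hypothetical value for illustration

my $double = "Hello, $name\n";       # $name and \n are both interpolated
my $single = 'Hello, $name\n';       # kept exactly as typed

print $double;
print $single, "\n";
print "Lengths: ", length($double), " and ", length($single), "\n";
```

The single-quoted string is longer because it still contains the literal characters C<$name> and C<\n>.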
137 | Numbers don't need quotes around them: | |
138 | ||
0b00a6f7 | 139 | print 42; |
bfe16a1a JH |
140 | |
141 | You can use parentheses for functions' arguments or omit them | |
41489bc0 | 142 | according to your personal taste. They are only required |
bfe16a1a JH |
143 | occasionally to clarify issues of precedence. |
144 | ||
0b00a6f7 KW |
145 | print("Hello, world\n"); |
146 | print "Hello, world\n"; | |
bfe16a1a JH |
147 | |
148 | More detailed information about Perl syntax can be found in L<perlsyn>. | |
149 | ||
150 | =head2 Perl variable types | |
151 | ||
152 | Perl has three main variable types: scalars, arrays, and hashes. | |
153 | ||
154 | =over 4 | |
155 | ||
156 | =item Scalars | |
157 | ||
158 | A scalar represents a single value: | |
159 | ||
0b00a6f7 KW |
160 | my $animal = "camel"; |
161 | my $answer = 42; | |
bfe16a1a | 162 | |
41489bc0 GS |
163 | Scalar values can be strings, integers or floating point numbers, and Perl |
164 | will automatically convert between them as required. There is no need | |
165 | to pre-declare your variable types, but you have to declare them using | |
e8b5ae53 | 166 | the C<my> keyword the first time you use them. (This is one of the |
41489bc0 | 167 | requirements of C<use strict;>.) |
bfe16a1a JH |
168 | |
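A short sketch of that automatic conversion, using made-up values: the same scalar is treated as a number or as a string depending on the operator applied to it.

```perl
use strict;
use warnings;

my $text   = "42";                 # a string that looks like a number
my $number = $text + 1;            # used as a number: 43
my $label  = $number . " camels";  # used as a string: "43 camels"
my $half   = "3.5" * 2;            # floating point works the same way: 7

print "$number, $label, $half\n";
```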
169 | Scalar values can be used in various ways: | |
170 | ||
0b00a6f7 KW |
171 | print $animal; |
172 | print "The animal is $animal\n"; | |
173 | print "The square of $answer is ", $answer * $answer, "\n"; | |
bfe16a1a JH |
174 | |
175 | There are a number of "magic" scalars with names that look like | |
176 | punctuation or line noise. These special variables are used for all | |
177 | kinds of purposes, and are documented in L<perlvar>. The only one you | |
178 | need to know about for now is C<$_> which is the "default variable". | |
179 | It's used as the default argument to a number of functions in Perl, and | |
41489bc0 | 180 | it's set implicitly by certain looping constructs. |
bfe16a1a | 181 | |
0b00a6f7 | 182 | print; # prints contents of $_ by default |
bfe16a1a JH |
183 | |
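Here is a minimal sketch of C<$_> doing both jobs at once. It borrows the C<for> loop described later in this document, and the list of animals is only an illustration.

```perl
use strict;
use warnings;

my @shouted;
for ("camel", "llama") {       # the loop sets $_ to each item in turn
    push @shouted, uc;         # uc with no argument works on $_
}

print "$_\n" for @shouted;     # $_ again, this time written out explicitly
```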
184 | =item Arrays | |
185 | ||
186 | An array represents a list of values: | |
187 | ||
0b00a6f7 KW |
188 | my @animals = ("camel", "llama", "owl"); |
189 | my @numbers = (23, 42, 69); | |
190 | my @mixed = ("camel", 42, 1.23); | |
bfe16a1a JH |
191 | |
192 | Arrays are zero-indexed. Here's how you get at elements in an array: | |
193 | ||
0b00a6f7 KW |
194 | print $animals[0]; # prints "camel" |
195 | print $animals[1]; # prints "llama" | |
bfe16a1a | 196 | |
41489bc0 | 197 | The special variable C<$#array> tells you the index of the last element |
bfe16a1a JH |
198 | of an array: |
199 | ||
0b00a6f7 | 200 | print $mixed[$#mixed]; # last element, prints 1.23 |
bfe16a1a | 201 | |
41489bc0 | 202 | You might be tempted to use C<$#array + 1> to tell you how many items there |
bfe16a1a JH |
203 | are in an array. Don't bother. As it happens, using C<@array> where Perl |
204 | expects to find a scalar value ("in scalar context") will give you the number | |
205 | of elements in the array: | |
206 | ||
0b00a6f7 | 207 | if (@animals < 5) { ... } |
bfe16a1a | 208 | |
41489bc0 | 209 | The elements we're getting from the array start with a C<$> because |
ac036724 | 210 | we're getting just a single value out of the array; you ask for a scalar, |
bfe16a1a JH |
211 | you get a scalar. |
212 | ||
d1be9408 | 213 | To get multiple values from an array: |
bfe16a1a | 214 | |
0b00a6f7 KW |
215 | @animals[0,1]; # gives ("camel", "llama"); |
216 | @animals[0..2]; # gives ("camel", "llama", "owl"); | |
217 | @animals[1..$#animals]; # gives all except the first element | |
bfe16a1a JH |
218 | |
219 | This is called an "array slice". | |
220 | ||
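Putting the pieces above together, a small sketch using the same C<@animals> array might look like this. Note that it also relies on an array being interpolated into a double-quoted string, which joins the elements with spaces.

```perl
use strict;
use warnings;

my @animals = ("camel", "llama", "owl");

my $count      = @animals;                  # scalar context: number of elements
my $last_index = $#animals;                 # index of the last element
my @first_two  = @animals[0, 1];            # slice: ("camel", "llama")
my @rest       = @animals[1 .. $#animals];  # slice: everything but the first

print "There are $count animals; the last index is $last_index\n";
print "First two: @first_two\n";
print "The rest: @rest\n";
```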
221 | You can do various useful things to lists: | |
222 | ||
0b00a6f7 KW |
223 | my @sorted = sort @animals; |
224 | my @backwards = reverse @numbers; | |
bfe16a1a JH |
225 | |
226 | There are a couple of special arrays too, such as C<@ARGV> (the command | |
227 | line arguments to your script) and C<@_> (the arguments passed to a | |
228 | subroutine). These are documented in L<perlvar>. | |
229 | ||
230 | =item Hashes | |
231 | ||
232 | A hash represents a set of key/value pairs: | |
233 | ||
0b00a6f7 | 234 | my %fruit_color = ("apple", "red", "banana", "yellow"); |
bfe16a1a JH |
235 | |
236 | You can use whitespace and the C<< => >> operator to lay them out more | |
237 | nicely: | |
238 | ||
0b00a6f7 KW |
239 | my %fruit_color = ( |
240 | apple => "red", | |
241 | banana => "yellow", | |
242 | ); | |
bfe16a1a JH |
243 | |
244 | To get at hash elements: | |
245 | ||
0b00a6f7 | 246 | $fruit_color{"apple"}; # gives "red" |
bfe16a1a JH |
247 | |
248 | You can get at lists of keys and values with C<keys()> and | |
249 | C<values()>. | |
250 | ||
0b00a6f7 KW |
251 | my @fruits = keys %fruit_color; |
252 | my @colors = values %fruit_color; | |
bfe16a1a JH |
253 | |
254 | Hashes have no particular internal order, though you can sort the keys | |
255 | and loop through them. | |
256 | ||
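For example, one way to walk through a hash in a predictable order is to sort its keys first. This is only a sketch; it uses the C<foreach> loop described later in this document.

```perl
use strict;
use warnings;

my %fruit_color = (
    apple  => "red",
    banana => "yellow",
);

my @report;
foreach my $fruit (sort keys %fruit_color) {    # keys in alphabetical order
    push @report, "$fruit is $fruit_color{$fruit}";
}

print "$_\n" foreach @report;
```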
41489bc0 | 257 | Just like special scalars and arrays, there are also special hashes. |
bfe16a1a JH |
258 | The best known of these is C<%ENV>, which contains environment | |
259 | variables. Read all about it (and other special variables) in | |
260 | L<perlvar>. | |
261 | ||
262 | =back | |
263 | ||
264 | Scalars, arrays and hashes are documented more fully in L<perldata>. | |
265 | ||
266 | More complex data types can be constructed using references, which allow | |
267 | you to build lists and hashes within lists and hashes. | |
268 | ||
269 | A reference is a scalar value and can refer to any other Perl data | |
e8b5ae53 | 270 | type. So by storing a reference as the value of an array or hash |
41489bc0 | 271 | element, you can easily create lists and hashes within lists and |
e8b5ae53 | 272 | hashes. The following example shows a two-level hash-of-hashes |
bfe16a1a JH |
273 | structure using anonymous hash references. |
274 | ||
0b00a6f7 KW |
275 | my $variables = { |
276 | scalar => { | |
277 | description => "single item", | |
278 | sigil => '$', | |
279 | }, | |
280 | array => { | |
281 | description => "ordered list of items", | |
282 | sigil => '@', | |
283 | }, | |
284 | hash => { | |
285 | description => "key/value pairs", | |
286 | sigil => '%', | |
287 | }, | |
288 | }; | |
289 | ||
290 | print "Scalars begin with a $variables->{'scalar'}->{'sigil'}\n"; | |
bfe16a1a JH |
291 | |
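The same idea works for lists inside hashes. In this sketch the hash C<%animals_by_type> and its contents are invented for illustration; square brackets create anonymous array references, and C<@{ ... }> turns a reference back into an array.

```perl
use strict;
use warnings;

my %animals_by_type = (
    mammals => ["camel", "llama"],   # [ ... ] makes an anonymous array reference
    birds   => ["owl"],
);

my $first_mammal = $animals_by_type{mammals}->[0];    # one element via the reference
my @mammals      = @{ $animals_by_type{mammals} };    # dereference the whole array
my $bird_count   = @{ $animals_by_type{birds} };      # scalar context: element count

print "First mammal: $first_mammal\n";
print "All mammals: @mammals\n";
print "Number of birds: $bird_count\n";
```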
292 | Exhaustive information on the topic of references can be found in | |
293 | L<perlreftut>, L<perllol>, L<perlref> and L<perldsc>. | |
294 | ||
295 | =head2 Variable scoping | |
296 | ||
297 | Throughout the previous section all the examples have used the syntax: | |
298 | ||
0b00a6f7 | 299 | my $var = "value"; |
bfe16a1a JH |
300 | |
301 | The C<my> is actually not required; you could just use: | |
302 | ||
0b00a6f7 | 303 | $var = "value"; |
bfe16a1a JH |
304 | |
305 | However, the above usage will create global variables throughout your | |
306 | program, which is bad programming practice. C<my> creates lexically | |
307 | scoped variables instead. The variables are scoped to the block | |
308 | (i.e. a bunch of statements surrounded by curly-braces) in which they | |
309 | are defined. | |
310 | ||
0b00a6f7 KW |
311 | my $x = "foo"; |
312 | my $some_condition = 1; | |
313 | if ($some_condition) { | |
314 | my $y = "bar"; | |
315 | print $x; # prints "foo" | |
316 | print $y; # prints "bar" | |
317 | } | |
318 | print $x; # prints "foo" | |
319 | print $y; # prints nothing; $y has fallen out of scope | |
bfe16a1a JH |
320 | |
321 | Using C<my> in combination with a C<use strict;> at the top of | |
41489bc0 | 322 | your Perl scripts means that the interpreter will pick up certain common |
bfe16a1a | 323 | programming errors. For instance, in the example above, the final |
432fb0a0 | 324 | C<print $y> would cause a compile-time error and prevent you from |
bfe16a1a JH |
325 | running the program. Using C<strict> is highly recommended. |
326 | ||
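A block can also declare its own variable with the same name as one outside it. The following sketch, with made-up values, suggests that the two are kept separate and that the outer one is unchanged once the block ends.

```perl
use strict;
use warnings;

my $x = "outer";
my @seen;

{
    my $x = "inner";        # a separate variable, visible only in this block
    push @seen, $x;
}
push @seen, $x;             # the outer $x was never touched

print "@seen\n";
```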
327 | =head2 Conditional and looping constructs | |
328 | ||
e36877c8 | 329 | Perl has most of the usual conditional and looping constructs. As of Perl |
330 | 5.10, it even has an experimental case/switch statement (spelled C<given>/C<when>). See | |
48238296 | 331 | L<perlsyn/"Switch Statements"> for more details. |
bfe16a1a JH |
332 | |
333 | The conditions can be any Perl expression. See the list of operators in | |
41489bc0 | 334 | the next section for information on comparison and boolean logic operators, |
bfe16a1a JH |
335 | which are commonly used in conditional statements. |
336 | ||
337 | =over 4 | |
338 | ||
339 | =item if | |
340 | ||
0b00a6f7 KW |
341 | if ( condition ) { |
342 | ... | |
343 | } elsif ( other condition ) { | |
344 | ... | |
345 | } else { | |
346 | ... | |
347 | } | |
bfe16a1a JH |
348 | |
349 | There's also a negated version of it: | |
350 | ||
0b00a6f7 KW |
351 | unless ( condition ) { |
352 | ... | |
353 | } | |
bfe16a1a | 354 | |
2cd1776c | 355 | This is provided as a more readable version of C<if (!I<condition>)>. |
bfe16a1a JH |
356 | |
357 | Note that the braces are required in Perl, even if you've only got one | |
358 | line in the block. However, there is a clever way of making your one-line | |
359 | conditional blocks more English-like: | |
360 | ||
0b00a6f7 KW |
361 | # the traditional way |
362 | if ($zippy) { | |
363 | print "Yow!"; | |
364 | } | |
bfe16a1a | 365 | |
0b00a6f7 KW |
366 | # the Perlish post-condition way |
367 | print "Yow!" if $zippy; | |
368 | print "We have no bananas" unless $bananas; | |
bfe16a1a JH |
369 | |
370 | =item while | |
371 | ||
0b00a6f7 KW |
372 | while ( condition ) { |
373 | ... | |
374 | } | |
bfe16a1a JH |
375 | |
376 | There's also a negated version, for the same reason we have C<unless>: | |
377 | ||
0b00a6f7 KW |
378 | until ( condition ) { |
379 | ... | |
380 | } | |
bfe16a1a JH |
381 | |
382 | You can also use C<while> in a post-condition: | |
383 | ||
0b00a6f7 | 384 | print "LA LA LA\n" while 1; # loops forever |
bfe16a1a JH |
385 | |
386 | =item for | |
387 | ||
388 | Exactly like C: | |
389 | ||
0b00a6f7 KW |
390 | for (my $i = 0; $i <= $max; $i++) { |
391 | ... | |
392 | } | |
bfe16a1a JH |
393 | |
394 | The C-style C<for> loop is rarely needed in Perl, since Perl provides | |
da75cd15 | 395 | the friendlier list-scanning C<foreach> loop. |
bfe16a1a JH |
396 | |
397 | =item foreach | |
398 | ||
0b00a6f7 KW |
399 | foreach (@array) { |
400 | print "This element is $_\n"; | |
401 | } | |
bfe16a1a | 402 | |
0b00a6f7 | 403 | print $list[$_] foreach 0 .. $max; |
74375ba5 | 404 | |
0b00a6f7 KW |
405 | # you don't have to use the default $_ either... |
406 | foreach my $key (keys %hash) { | |
407 | print "The value of $key is $hash{$key}\n"; | |
408 | } | |
bfe16a1a | 409 | |
a6642c72 | 410 | The C<foreach> keyword is actually a synonym for the C<for> |
6597fae6 | 411 | keyword. See C<L<perlsyn/"Foreach Loops">>. |
a6642c72 | 412 | |
bfe16a1a JH |
413 | =back |
414 | ||
415 | For more detail on looping constructs (and some that weren't mentioned in | |
416 | this overview) see L<perlsyn>. | |
417 | ||
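To compare the loop styles described above, here is a small sketch that adds up the numbers from 0 to C<$max> twice and then counts down with C<until>. The variable names and the value of C<$max> are assumptions chosen for the example.

```perl
use strict;
use warnings;

my $max = 4;

# C-style loop: explicit counter, test and increment.
my $c_style_total = 0;
for (my $i = 0; $i <= $max; $i++) {
    $c_style_total += $i;
}

# foreach in post-condition form over a range; each number arrives in $_.
my $foreach_total = 0;
$foreach_total += $_ foreach 0 .. $max;

# until keeps looping while its condition is false.
my $countdown = 3;
my @ticks;
until ($countdown == 0) {
    push @ticks, $countdown;
    $countdown -= 1;
}

print "Totals: $c_style_total and $foreach_total\n";
print "Countdown: @ticks\n";
```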
418 | =head2 Builtin operators and functions | |
419 | ||
420 | Perl comes with a wide selection of builtin functions. Some of the ones | |
421 | we've already seen include C<print>, C<sort> and C<reverse>. A list of | |
41489bc0 | 422 | them is given at the start of L<perlfunc> and you can easily read |
2cd1776c | 423 | about any given function by using C<perldoc -f I<functionname>>. |
bfe16a1a JH |
424 | |
425 | Perl operators are documented in full in L<perlop>, but here are a few | |
426 | of the most common ones: | |
427 | ||
428 | =over 4 | |
429 | ||
430 | =item Arithmetic | |
431 | ||
0b00a6f7 KW |
432 | + addition |
433 | - subtraction | |
434 | * multiplication | |
435 | / division | |
bfe16a1a JH |
436 | |
437 | =item Numeric comparison | |
438 | ||
0b00a6f7 KW |
439 | == equality |
440 | != inequality | |
441 | < less than | |
442 | > greater than | |
443 | <= less than or equal | |
444 | >= greater than or equal | |
bfe16a1a JH |
445 | |
446 | =item String comparison | |
447 | ||
0b00a6f7 KW |
448 | eq equality |
449 | ne inequality | |
450 | lt less than | |
451 | gt greater than | |
452 | le less than or equal | |
453 | ge greater than or equal | |
bfe16a1a | 454 | |
41489bc0 GS |
455 | (Why do we have separate numeric and string comparisons? Because we don't |
456 | have special variable types, and Perl needs to know whether to sort | |
bfe16a1a JH |
457 | numerically (where 99 is less than 100) or alphabetically (where 100 comes |
458 | before 99).) | |
459 | ||
460 | =item Boolean logic | |
461 | ||
0b00a6f7 KW |
462 | && and |
463 | || or | |
464 | ! not | |
bfe16a1a | 465 | |
41489bc0 | 466 | (C<and>, C<or> and C<not> aren't just in the above table as descriptions |
e8b5ae53 | 467 | of the operators. They're also supported as operators in their own |
41489bc0 GS |
468 | right. They're more readable than the C-style operators, but have |
469 | different precedence to C<&&> and friends. Check L<perlop> for more | |
bfe16a1a JH |
470 | detail.) |
471 | ||
472 | =item Miscellaneous | |
473 | ||
0b00a6f7 KW |
474 | = assignment |
475 | . string concatenation | |
476 | x string multiplication | |
477 | .. range operator (creates a list of numbers) | |
bfe16a1a JH |
478 | |
479 | =back | |
480 | ||
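The difference between numeric and string comparison is easiest to see when sorting. This sketch uses C<< <=> >> and C<cmp>, the sorting counterparts of the comparison operators listed above (they are documented in L<perlop> rather than in this introduction), on a made-up list of numbers.

```perl
use strict;
use warnings;

my @values = (100, 99, 9);

my @as_numbers = sort { $a <=> $b } @values;   # numeric order
my @as_strings = sort { $a cmp $b } @values;   # alphabetical order

my $numeric_less = 99 <  100 ? "yes" : "no";   # compared as numbers
my $string_less  = 99 lt 100 ? "yes" : "no";   # compared as the strings "99" and "100"

print "Numeric sort: @as_numbers\n";
print "String sort:  @as_strings\n";
print "99 < 100? $numeric_less. 99 lt 100? $string_less.\n";
```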
481 | Many operators can be combined with a C<=> as follows: | |
482 | ||
0b00a6f7 KW |
483 | $a += 1; # same as $a = $a + 1 |
484 | $a -= 1; # same as $a = $a - 1 | |
485 | $a .= "\n"; # same as $a = $a . "\n"; | |
bfe16a1a JH |
486 | |
487 | =head2 Files and I/O | |
488 | ||
489 | You can open a file for input or output using the C<open()> function. | |
41489bc0 | 490 | It's documented in extravagant detail in L<perlfunc> and L<perlopentut>, |
bfe16a1a JH |
491 | but in short: |
492 | ||
0b00a6f7 KW |
493 | open(my $in, "<", "input.txt") or die "Can't open input.txt: $!"; |
494 | open(my $out, ">", "output.txt") or die "Can't open output.txt: $!"; | |
495 | open(my $log, ">>", "my.log") or die "Can't open my.log: $!"; | |
bfe16a1a JH |
496 | |
497 | You can read from an open filehandle using the C<< <> >> operator. In | |
498 | scalar context it reads a single line from the filehandle, and in list | |
499 | context it reads the whole file in, assigning each line to an element of | |
500 | the list: | |
501 | ||
0b00a6f7 KW |
502 | my $line = <$in>; |
503 | my @lines = <$in>; | |
bfe16a1a | 504 | |
e8b5ae53 FC |
505 | Reading in the whole file at one time is called slurping. It can |
506 | be useful but it may be a memory hog. Most text file processing | |
bfe16a1a JH |
507 | can be done a line at a time with Perl's looping constructs. |
508 | ||
509 | The C<< <> >> operator is most often seen in a C<while> loop: | |
510 | ||
0b00a6f7 KW |
511 | while (<$in>) { # assigns each line in turn to $_ |
512 | print "Just read in this line: $_"; | |
513 | } | |
bfe16a1a JH |
514 | |
515 | We've already seen how to print to standard output using C<print()>. | |
516 | However, C<print()> can also take an optional first argument specifying | |
517 | which filehandle to print to: | |
518 | ||
0b00a6f7 KW |
519 | print STDERR "This is your final warning.\n"; |
520 | print $out $record; | |
521 | print $log $logmessage; | |
bfe16a1a JH |
522 | |
523 | When you're done with your filehandles, you should C<close()> them | |
524 | (though to be honest, Perl will clean up after you if you forget): | |
525 | ||
0b00a6f7 | 526 | close $in or die "$in: $!"; |
bfe16a1a JH |
527 | |
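The following sketch strings these steps together: write a file, append to it, then read it back. So that it does not touch any real files, it assumes a scratch file obtained from the core C<File::Temp> module, which is not otherwise covered in this introduction.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Ask File::Temp for a scratch file; UNLINK removes it when the program exits.
my ($scratch, $path) = tempfile(UNLINK => 1);
close $scratch or die "Can't close $path: $!";

# ">" creates or truncates the file.
open(my $out, ">", $path) or die "Can't open $path: $!";
print $out "first line\n";
print $out "second line\n";
close $out or die "Can't close $path: $!";

# ">>" appends to what is already there.
open(my $log, ">>", $path) or die "Can't open $path: $!";
print $log "appended line\n";
close $log or die "Can't close $path: $!";

# "<" reads the file back.
open(my $in, "<", $path) or die "Can't open $path: $!";
my $first = <$in>;    # scalar context: just the next line
my @rest  = <$in>;    # list context: every remaining line
close $in or die "Can't close $path: $!";

print "First: $first";
print "Then:  $_" foreach @rest;
```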
528 | =head2 Regular expressions | |
529 | ||
530 | Perl's regular expression support is both broad and deep, and is the | |
531 | subject of lengthy documentation in L<perlrequick>, L<perlretut>, and | |
532 | elsewhere. However, in short: | |
533 | ||
534 | =over 4 | |
535 | ||
536 | =item Simple matching | |
537 | ||
0b00a6f7 KW |
538 | if (/foo/) { ... } # true if $_ contains "foo" |
539 | if ($a =~ /foo/) { ... } # true if $a contains "foo" | |
bfe16a1a JH |
540 | |
541 | The C<//> matching operator is documented in L<perlop>. It operates on | |
542 | C<$_> by default, or can be bound to another variable using the C<=~> | |
543 | binding operator (also documented in L<perlop>). | |
544 | ||
545 | =item Simple substitution | |
546 | ||
0b00a6f7 KW |
547 | s/foo/bar/; # replaces foo with bar in $_ |
548 | $a =~ s/foo/bar/; # replaces foo with bar in $a | |
549 | $a =~ s/foo/bar/g; # replaces ALL INSTANCES of foo with bar | |
550 | # in $a | |
bfe16a1a JH |
551 | |
552 | The C<s///> substitution operator is documented in L<perlop>. | |
553 | ||
554 | =item More complex regular expressions | |
555 | ||
556 | You don't just have to match on fixed strings. In fact, you can match | |
557 | on just about anything you could dream of by using more complex regular | |
558 | expressions. These are documented at great length in L<perlre>, but for | |
559 | the meantime, here's a quick cheat sheet: | |
560 | ||
0b00a6f7 KW |
561 | . a single character |
562 | \s a whitespace character (space, tab, newline, | |
563 | ...) | |
564 | \S non-whitespace character | |
565 | \d a digit (0-9) | |
566 | \D a non-digit | |
567 | \w a word character (a-z, A-Z, 0-9, _) | |
568 | \W a non-word character | |
569 | [aeiou] matches a single character in the given set | |
570 | [^aeiou] matches a single character outside the given | |
571 | set | |
572 | (foo|bar|baz) matches any of the alternatives specified | |
573 | ||
574 | ^ start of string | |
575 | $ end of string | |
bfe16a1a | 576 | |
41489bc0 GS |
577 | Quantifiers can be used to specify how many of the previous thing you |
578 | want to match on, where "thing" means either a literal character, one | |
579 | of the metacharacters listed above, or a group of characters or | |
bfe16a1a JH |
580 | metacharacters in parentheses. |
581 | ||
0b00a6f7 KW |
582 | * zero or more of the previous thing |
583 | + one or more of the previous thing | |
584 | ? zero or one of the previous thing | |
585 | {3} matches exactly 3 of the previous thing | |
586 | {3,6} matches between 3 and 6 of the previous thing | |
587 | {3,} matches 3 or more of the previous thing | |
bfe16a1a JH |
588 | |
589 | Some brief examples: | |
590 | ||
0b00a6f7 KW |
591 | /^\d+/ string starts with one or more digits |
592 | /^$/ nothing in the string (start and end are | |
593 | adjacent) | |
594 | /(\d\s){3}/ three digits, each followed by a whitespace | |
595 | character (eg "3 4 5 ") | |
596 | /(a.)+/ matches a string in which every odd-numbered | |
597 | letter is a (eg "abacadaf") | |
bfe16a1a | 598 | |
0b00a6f7 KW |
599 | # This loop reads from STDIN, and prints non-blank lines: |
600 | while (<>) { | |
601 | next if /^$/; | |
602 | print; | |
603 | } | |
bfe16a1a JH |
604 | |
605 | =item Parentheses for capturing | |
606 | ||
41489bc0 | 607 | As well as grouping, parentheses serve a second purpose. They can be |
bfe16a1a JH |
608 | used to capture the results of parts of the regexp match for later use. |
609 | The results end up in C<$1>, C<$2> and so on. | |
610 | ||
0b00a6f7 | 611 | # a cheap and nasty way to break an email address up into parts |
bfe16a1a | 612 | |
0b00a6f7 KW |
613 | if ($email =~ /([^@]+)@(.+)/) { |
614 | print "Username is $1\n"; | |
615 | print "Hostname is $2\n"; | |
616 | } | |
bfe16a1a JH |
617 | |
618 | =item Other regexp features | |
619 | ||
620 | Perl regexps also support backreferences, lookaheads, and all kinds of | |
621 | other complex details. Read all about them in L<perlrequick>, | |
622 | L<perlretut>, and L<perlre>. | |
623 | ||
624 | =back | |
625 | ||
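Here is a short sketch that exercises matching, capturing and substitution together. The email address and the sample strings are invented for the example, and the substitutions are applied to copies so that the original text stays intact.

```perl
use strict;
use warnings;

# Capturing: pull the two halves of an address out of $1 and $2.
my $email = 'user@example.com';     # hypothetical address
my ($username, $hostname);
if ($email =~ /([^@]+)@(.+)/) {
    ($username, $hostname) = ($1, $2);
}

# Substitution, with and without the g modifier, applied to copies of $text.
my $text = "foo foo foo";
(my $first_only = $text) =~ s/foo/bar/;     # replaces only the first foo
(my $every_one  = $text) =~ s/foo/bar/g;    # replaces every foo

# A quantifier applied to a parenthesised group.
my $three_digits = "3 4 5 " =~ /(\d\s){3}/ ? "matches" : "does not match";

print "Username is $username, hostname is $hostname\n";
print "$first_only / $every_one\n";
print "The digit pattern $three_digits\n";
```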
626 | =head2 Writing subroutines | |
627 | ||
628 | Writing subroutines is easy: | |
629 | ||
0b00a6f7 KW |
630 | sub logger { |
631 | my $logmessage = shift; | |
632 | open my $logfile, ">>", "my.log" or die "Could not open my.log: $!"; | |
633 | print $logfile $logmessage; | |
634 | } | |
bfe16a1a | 635 | |
74375ba5 GS |
636 | Now we can use the subroutine just like any built-in function: |
637 | ||
0b00a6f7 | 638 | logger("We have a logger subroutine!"); |
74375ba5 | 639 | |
bfe16a1a JH |
640 | What's that C<shift>? Well, the arguments to a subroutine are available |
641 | to us as a special array called C<@_> (see L<perlvar> for more on that). | |
642 | The default argument to the C<shift> function just happens to be C<@_>. | |
643 | So C<my $logmessage = shift;> shifts the first item off the list of | |
41489bc0 | 644 | arguments and assigns it to C<$logmessage>. |
bfe16a1a JH |
645 | |
646 | We can manipulate C<@_> in other ways too: | |
647 | ||
0b00a6f7 KW |
648 | my ($logmessage, $priority) = @_; # common |
649 | my $logmessage = $_[0]; # uncommon, and ugly | |
bfe16a1a JH |
650 | |
651 | Subroutines can also return values: | |
652 | ||
0b00a6f7 KW |
653 | sub square { |
654 | my $num = shift; | |
655 | my $result = $num * $num; | |
656 | return $result; | |
657 | } | |
bfe16a1a | 658 | |
74375ba5 GS |
659 | Then use it like: |
660 | ||
0b00a6f7 | 661 | my $sq = square(8); |
74375ba5 | 662 | |
bfe16a1a JH |
663 | For more information on writing subroutines, see L<perlsub>. |
664 | ||
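As a further sketch, here is a subroutine that unpacks two arguments from C<@_> and returns a value. The name C<format_log> and the default priority of "info" are assumptions made for this example, not part of any standard interface.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a log line instead of writing it anywhere.
sub format_log {
    my ($logmessage, $priority) = @_;               # unpack the argument list
    $priority = "info" unless defined $priority;    # default when omitted
    return "[$priority] $logmessage\n";
}

print format_log("Disk is getting full", "warning");
print format_log("We have a logger subroutine!");
```

Returning the string instead of printing it inside the subroutine leaves the caller free to decide where the message should go.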
665 | =head2 OO Perl | |
666 | ||
667 | OO Perl is relatively simple. Objects are implemented as references that | |
668 | know what sort of object they are, based on Perl's concept of packages. | |
41489bc0 | 669 | However, OO Perl is largely beyond the scope of this document. |
82e1c0d9 | 670 | Read L<perlootut> and L<perlobj>. |
bfe16a1a JH |
671 | |
672 | As a beginning Perl programmer, your most common use of OO Perl will be | |
673 | in using third-party modules, which are documented below. | |
674 | ||
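To give a flavour of what that means, here is a very small sketch of a class. The package name C<Animal>, its methods and the name passed to the constructor are all hypothetical. L<perlootut> describes the conventions properly.

```perl
use strict;
use warnings;

package Animal;    # hypothetical class, for illustration only

sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name} };   # an ordinary hash reference...
    return bless $self, $class;           # ...that now knows its package
}

sub name {
    my $self = shift;                     # the object is the first argument
    return $self->{name};
}

package main;

my $camel = Animal->new(name => "Amelia");
print ref($camel), " named ", $camel->name, "\n";
```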
675 | =head2 Using Perl modules | |
676 | ||
677 | Perl modules provide a range of features to help you avoid reinventing | |
f224927c | 678 | the wheel, and can be downloaded from CPAN ( http://www.cpan.org/ ). A |
bfe16a1a JH |
679 | number of popular modules are included with the Perl distribution |
680 | itself. | |
681 | ||
682 | Categories of modules range from text manipulation to network protocols | |
683 | to database integration to graphics. A categorized list of modules is | |
684 | also available from CPAN. | |
685 | ||
686 | To learn how to install modules you download from CPAN, read | |
514f8bac | 687 | L<perlmodinstall>. |
bfe16a1a | 688 | |
2cd1776c AMS |
689 | To learn how to use a particular module, use C<perldoc I<Module::Name>>. |
690 | Typically you will want to C<use I<Module::Name>>, which will then give | |
691 | you access to exported functions or an OO interface to the module. | |
bfe16a1a JH |
692 | |
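For instance, C<List::Util> is one of the modules that ships with Perl. The sketch below imports three of its exported functions by name; the list of numbers is made up for the example.

```perl
use strict;
use warnings;
use List::Util qw(sum max first);    # import three exported functions

my @numbers = (23, 42, 69);

my $total      = sum(@numbers);                      # adds up the list
my $largest    = max(@numbers);                      # largest value
my $first_even = first { $_ % 2 == 0 } @numbers;     # first item passing the test

print "Total $total, largest $largest, first even number $first_even\n";
```

Running C<perldoc List::Util> lists the other functions the module provides.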
693 | L<perlfaq> contains questions and answers related to many common | |
694 | tasks, and often provides suggestions for good CPAN modules to use. | |
695 | ||
696 | L<perlmod> describes Perl modules in general. L<perlmodlib> lists the | |
697 | modules which came with your Perl installation. | |
698 | ||
699 | If you feel the urge to write Perl modules, L<perlnewmod> will give you | |
700 | good advice. | |
701 | ||
702 | =head1 AUTHOR | |
703 | ||
704 | Kirrily "Skud" Robert <skud@cpan.org> |