| 1 | =head1 NAME |
| 2 | |
| 3 | perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores) |
| 4 | |
| 5 | =head1 DESCRIPTION |
| 6 | |
| 7 | The basic IPC facilities of Perl are built out of the good old Unix |
| 8 | signals, named pipes, pipe opens, the Berkeley socket routines, and SysV |
| 9 | IPC calls. Each is used in slightly different situations. |
| 10 | |
| 11 | =head1 Signals |
| 12 | |
| 13 | Perl uses a simple signal handling model: the %SIG hash contains names |
| 14 | or references of user-installed signal handlers. Each handler is |
| 15 | called with one argument: the name of the signal that triggered |
| 16 | it. A signal may be generated intentionally from a |
| 17 | particular keyboard sequence like control-C or control-Z, sent to you |
| 18 | from another process, or triggered automatically by the kernel when |
| 19 | special events transpire, like a child process exiting, your own process |
| 20 | running out of stack space, or hitting a process file-size limit. |
| 21 | |
| 22 | For example, to trap an interrupt signal, set up a handler like this: |
| 23 | |
| 24 | our $shucks; |
| 25 | |
| 26 | sub catch_zap { |
| 27 | my $signame = shift; |
| 28 | $shucks++; |
| 29 | die "Somebody sent me a SIG$signame"; |
| 30 | } |
| 31 | $SIG{INT} = __PACKAGE__ . "::catch_zap"; |
| 32 | $SIG{INT} = \&catch_zap; # best strategy |
| 33 | |
| 34 | Prior to Perl 5.8.0 it was necessary to do as little as you possibly |
| 35 | could in your handler; notice how all we do is set a global variable |
| 36 | and then raise an exception. That's because on most systems, |
| 37 | libraries are not re-entrant; particularly, memory allocation and I/O |
| 38 | routines are not. That meant that doing nearly I<anything> in your |
| 39 | handler could in theory trigger a memory fault and subsequent core |
| 40 | dump - see L</Deferred Signals (Safe Signals)> below. |
| 41 | |
| 42 | The names of the signals are the ones listed out by C<kill -l> on your |
| 43 | system, or you can retrieve them using the CPAN module L<IPC::Signal>. |
| 44 | |
| 45 | You may also choose to assign the strings C<"IGNORE"> or C<"DEFAULT"> as |
| 46 | the handler, in which case Perl will try to discard the signal or do the |
| 47 | default thing. |
| 48 | |
| 49 | On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal |
| 50 | has special behavior with respect to a value of C<"IGNORE">. |
| 51 | Setting C<$SIG{CHLD}> to C<"IGNORE"> on such a platform has the effect of |
| 52 | not creating zombie processes when the parent process fails to C<wait()> |
| 53 | on its child processes (i.e., child processes are automatically reaped). |
| 54 | Calling C<wait()> with C<$SIG{CHLD}> set to C<"IGNORE"> usually returns |
| 55 | C<-1> on such platforms. |
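
The automatic reaping described above can be observed with a short sketch like the following. It assumes a platform that honors C<"IGNORE"> for C<CHLD> in this way; the variable names are only illustrative.

```perl
use strict;
use warnings;

$SIG{CHLD} = "IGNORE";          # ask the system to reap children for us

my $pid = fork();
die "cannot fork: $!" unless defined $pid;
if ($pid == 0) {
    exit 0;                     # child finishes immediately
}

# On platforms with the special CHLD behavior there is nothing left to
# collect, so waitpid() reports failure instead of returning the child's PID.
my $reaped = waitpid($pid, 0);
print "waitpid returned $reaped\n";
```

On systems without this behavior, waitpid() would instead return the PID of the child, so do not rely on the C<-1> in portable code.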
| 56 | |
| 57 | Some signals can be neither trapped nor ignored, such as the KILL and STOP |
| 58 | (but not the TSTP) signals. Note that ignoring signals makes them disappear. |
| 59 | If you only want them blocked temporarily without them getting lost you'll |
| 60 | have to use sigprocmask() from the POSIX module. |
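
A minimal sketch of such temporary blocking follows. It sends C<SIGINT> to itself purely for demonstration; the variable names are illustrative, and the handler merely counts deliveries.

```perl
use strict;
use warnings;
use POSIX qw(:signal_h);    # SIGINT, SIG_BLOCK, SIG_SETMASK, sigprocmask

my $got_int = 0;
$SIG{INT} = sub { $got_int++ };

my $block = POSIX::SigSet->new(SIGINT);   # signals to hold back
my $old   = POSIX::SigSet->new;           # receives the previous mask

sigprocmask(SIG_BLOCK, $block, $old)
    or die "can't block SIGINT: $!";

kill INT => $$;                           # raised now, but kept pending
my $seen_while_blocked = $got_int;        # handler has not run yet

sigprocmask(SIG_SETMASK, $old)
    or die "can't restore signal mask: $!";

# The pending SIGINT is delivered once the old mask is back in place.
print "while blocked: $seen_while_blocked, after unblocking: $got_int\n";
```

Unlike C<"IGNORE">, the signal is not lost: it stays pending until the mask is restored.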
| 61 | |
| 62 | Sending a signal to a negative process ID means that you send the signal |
| 63 | to the entire Unix process group. This code sends a hang-up signal to all |
| 64 | processes in the current process group, and also sets C<$SIG{HUP}> to C<"IGNORE"> |
| 65 | so it doesn't kill itself: |
| 66 | |
| 67 | # block scope for local |
| 68 | { |
| 69 | local $SIG{HUP} = "IGNORE"; |
| 70 | kill HUP => -$$; |
| 71 | # snazzy writing of: kill("HUP", -$$) |
| 72 | } |
| 73 | |
| 74 | Another interesting signal to send is signal number zero. This doesn't |
| 75 | actually affect a child process, but instead checks whether it's alive |
| 76 | or has changed its UIDs. |
| 77 | |
| 78 | unless (kill 0 => $kid_pid) { |
| 79 | warn "something wicked happened to $kid_pid"; |
| 80 | } |
| 81 | |
| 82 | Sending signal number zero may fail even though the process is alive, |
| 83 | because you lack permission to signal it: that happens when its real |
| 84 | or saved UID is not identical to the real or effective UID of the |
| 85 | sending process. You may be able to determine the cause of |
| 86 | failure using C<$!> or C<%!>. |
| 87 | |
| 88 | unless (kill(0 => $pid) || $!{EPERM}) { |
| 89 | warn "$pid looks dead"; |
| 90 | } |
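
If you want to tell the possible outcomes apart, a small helper along these lines may do. The name C<probe_pid> and its return strings are made up for this sketch; only C<kill> and C<%!> come from Perl itself.

```perl
use strict;
use warnings;

# Hypothetical helper: classify the result of probing a PID with signal 0.
sub probe_pid {
    my ($pid) = @_;
    return "alive"           if kill 0 => $pid;
    return "alive, not ours" if $!{EPERM};   # exists, but we may not signal it
    return "gone"            if $!{ESRCH};   # no such process
    return "unknown: $!";
}

my $self_state = probe_pid($$);
print "process $$ is $self_state\n";
```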
| 91 | |
| 92 | You might also want to employ anonymous functions for simple signal |
| 93 | handlers: |
| 94 | |
| 95 | $SIG{INT} = sub { die "\nOutta here!\n" }; |
| 96 | |
| 97 | SIGCHLD handlers require some special care. If a second child dies |
| 98 | while in the signal handler caused by the first death, we won't get |
| 99 | another signal. So we must loop here, or else we will leave the unreaped child |
| 100 | as a zombie. And the next time two children die we get another zombie. |
| 101 | And so on. |
| 102 | |
| 103 | use POSIX ":sys_wait_h"; |
| 104 | $SIG{CHLD} = sub { |
| 105 | while ((my $child = waitpid(-1, WNOHANG)) > 0) { |
| 106 | $Kid_Status{$child} = $?; |
| 107 | } |
| 108 | }; |
| 109 | # do something that forks... |
| 110 | |
| 111 | Be careful: qx(), system(), and some modules for calling external commands |
| 112 | do a fork(), then wait() for the result. Thus, your signal handler |
| 113 | will be called. Because wait() was already called by system() or qx(), |
| 114 | a plain wait() in the signal handler would see no more zombies and would |
| 115 | therefore block. |
| 116 | |
| 117 | The best way to prevent this issue is to use waitpid() with the WNOHANG |
| 118 | flag, as in the following example: |
| 119 | |
| 120 | use POSIX ":sys_wait_h"; # for WNOHANG (nonblocking waitpid) |
| 121 | |
| 122 | my %children; |
| 123 | |
| 124 | $SIG{CHLD} = sub { |
| 125 | # don't change $! and $? outside handler |
| 126 | local ($!, $?); |
| 127 | while ( (my $pid = waitpid(-1, WNOHANG)) > 0 ) { |
| 128 | delete $children{$pid}; |
| 129 | cleanup_child($pid, $?); |
| 130 | } |
| 131 | }; |
| 132 | |
| 133 | while (1) { |
| 134 | my $pid = fork(); |
| 135 | die "cannot fork" unless defined $pid; |
| 136 | if ($pid == 0) { |
| 137 | # ... |
| 138 | exit 0; |
| 139 | } else { |
| 140 | $children{$pid}=1; |
| 141 | # ... |
| 142 | system($command); |
| 143 | # ... |
| 144 | } |
| 145 | } |
| 146 | |
| 147 | Signal handling is also used for timeouts in Unix. While safely |
| 148 | protected within an C<eval{}> block, you set a signal handler to trap |
| 149 | alarm signals and then schedule to have one delivered to you in some |
| 150 | number of seconds. Then try your blocking operation, clearing the alarm |
| 151 | as soon as it's done, while you are still inside your C<eval{}> block. If the |
| 152 | alarm goes off first, the handler uses die() to jump out of the block. |
| 153 | |
| 154 | Here's an example: |
| 155 | |
| 156 | my $ALARM_EXCEPTION = "alarm clock restart"; |
| 157 | eval { |
| 158 | local $SIG{ALRM} = sub { die $ALARM_EXCEPTION }; |
| 159 | alarm 10; |
| 160 | flock(FH, 2) # blocking write lock |
| 161 | || die "cannot flock: $!"; |
| 162 | alarm 0; |
| 163 | }; |
| 164 | if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die } |
| 165 | |
| 166 | If the operation being timed out is system() or qx(), this technique |
| 167 | is liable to generate zombies. If this matters to you, you'll |
| 168 | need to do your own fork() and exec(), and kill the errant child process. |
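
One way to arrange that is sketched below. The helper name C<run_with_timeout> is hypothetical, and C<$^X> (the running perl) merely stands in for a real external command. A small window remains in which the alarm can fire just after waitpid() returns; a production version would need to guard against that.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run @cmd without a shell, giving up after $secs seconds.
# Returns the wait status, or undef if the command had to be killed.
sub run_with_timeout {
    my ($secs, @cmd) = @_;

    my $pid = fork();
    die "cannot fork: $!" unless defined $pid;
    if ($pid == 0) {                        # child
        exec { $cmd[0] } @cmd;              # returns only if exec fails
        warn "cannot exec $cmd[0]: $!\n";
        POSIX::_exit(127);
    }

    my $status;
    my $finished = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $secs;
        waitpid($pid, 0);
        $status = $?;
        alarm 0;
        1;
    };
    return $status if $finished;

    die $@ unless $@ eq "timeout\n";        # not our exception: propagate it
    kill KILL => $pid;                      # stop the errant child ...
    waitpid($pid, 0);                       # ... and reap it, so no zombie remains
    return;
}

my $quick = run_with_timeout(5, $^X, "-e", "exit 3");
my $slow  = run_with_timeout(1, $^X, "-e", "sleep 30");
print "quick exited with ", $quick >> 8, "\n";
print "slow was ", defined($slow) ? "not killed" : "killed on timeout", "\n";
```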
| 169 | |
| 170 | For more complex signal handling, you might see the standard POSIX |
| 171 | module. Lamentably, this is almost entirely undocumented, but |
| 172 | the F<t/lib/posix.t> file from the Perl source distribution has some |
| 173 | examples in it. |
| 174 | |
| 175 | =head2 Handling the SIGHUP Signal in Daemons |
| 176 | |
| 177 | A process that usually starts when the system boots and shuts down |
| 178 | when the system is shut down is called a daemon (Disk And Execution |
| 179 | MONitor). If a daemon process has a configuration file which is |
| 180 | modified after the process has been started, there should be a way to |
| 181 | tell that process to reread its configuration file without stopping |
| 182 | the process. Many daemons provide this mechanism using a C<SIGHUP> |
| 183 | signal handler. When you want to tell the daemon to reread the file, |
| 184 | simply send it the C<SIGHUP> signal. |
| 185 | |
| 186 | The following example implements a simple daemon, which restarts |
| 187 | itself every time the C<SIGHUP> signal is received. The actual code is |
| 188 | located in the subroutine C<code()>, which just prints some debugging |
| 189 | info to show that it works; it should be replaced with the real code. |
| 190 | |
| 191 | #!/usr/bin/perl -w |
| 192 | |
| 193 | use POSIX (); |
| 194 | use FindBin (); |
| 195 | use File::Basename (); |
| 196 | use File::Spec::Functions; |
| 197 | |
| 198 | $| = 1; |
| 199 | |
| 200 | # make the daemon cross-platform, so exec always calls the script |
| 201 | # itself with the right path, no matter how the script was invoked. |
| 202 | my $script = File::Basename::basename($0); |
| 203 | my $SELF = catfile($FindBin::Bin, $script); |
| 204 | |
| 205 | # POSIX unmasks the sigprocmask properly |
| 206 | $SIG{HUP} = sub { |
| 207 | print "got SIGHUP\n"; |
| 208 | exec($SELF, @ARGV) || die "$0: couldn't restart: $!"; |
| 209 | }; |
| 210 | |
| 211 | code(); |
| 212 | |
| 213 | sub code { |
| 214 | print "PID: $$\n"; |
| 215 | print "ARGV: @ARGV\n"; |
| 216 | my $count = 0; |
| 217 | while (++$count) { |
| 218 | sleep 2; |
| 219 | print "$count\n"; |
| 220 | } |
| 221 | } |
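
The daemon above restarts itself completely. A lighter alternative, sketched here under the assumption that the configuration can be reloaded in place, is to have the handler set a flag that the main loop checks. C<load_config> is a hypothetical stand-in for your real parser, and the C<kill> merely simulates an administrator sending the signal.

```perl
use strict;
use warnings;

my $reload = 0;
$SIG{HUP} = sub { $reload = 1 };     # just note the request

# Hypothetical stand-in for parsing a real configuration file.
my $generation = 0;
sub load_config { return { generation => ++$generation } }

my $config = load_config();

kill HUP => $$;                      # simulate "kill -HUP <pid>"

# Inside the main loop: reload at a point where it is convenient.
if ($reload) {
    $reload = 0;
    $config = load_config();
}
print "running with configuration generation $config->{generation}\n";
```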
| 222 | |
| 223 | |
| 224 | =head2 Deferred Signals (Safe Signals) |
| 225 | |
| 226 | Before Perl 5.8.0, installing Perl code to deal with signals exposed you to |
| 227 | danger from two things. First, few system library functions are |
| 228 | re-entrant. If the signal interrupts while Perl is executing one function |
| 229 | (like malloc(3) or printf(3)), and your signal handler then calls the same |
| 230 | function again, you could get unpredictable behavior--often, a core dump. |
| 231 | Second, Perl isn't itself re-entrant at the lowest levels. If the signal |
| 232 | interrupts Perl while Perl is changing its own internal data structures, |
| 233 | similarly unpredictable behavior may result. |
| 234 | |
| 235 | There were two things you could do, knowing this: be paranoid or be |
| 236 | pragmatic. The paranoid approach was to do as little as possible in your |
| 237 | signal handler. Set an existing integer variable that already has a |
| 238 | value, and return. This doesn't help you if you're in a slow system call, |
| 239 | which will just restart. That means you have to C<die> to longjmp(3) out |
| 240 | of the handler. Even this is a little cavalier for the true paranoiac, |
| 241 | who avoids C<die> in a handler because the system I<is> out to get you. |
| 242 | The pragmatic approach was to say "I know the risks, but prefer the |
| 243 | convenience", and to do anything you wanted in your signal handler, |
| 244 | and be prepared to clean up core dumps now and again. |
| 245 | |
| 246 | Perl 5.8.0 and later avoid these problems by "deferring" signals. That is, |
| 247 | when the signal is delivered to the process by the system (to the C code |
| 248 | that implements Perl) a flag is set, and the handler returns immediately. |
| 249 | Then at strategic "safe" points in the Perl interpreter (e.g. when it is |
| 250 | about to execute a new opcode) the flags are checked and the Perl level |
| 251 | handler from %SIG is executed. The "deferred" scheme allows much more |
| 252 | flexibility in the coding of signal handlers as we know the Perl |
| 253 | interpreter is in a safe state, and that we are not in a system library |
| 254 | function when the handler is called. However the implementation does |
| 255 | differ from previous Perls in the following ways: |
| 256 | |
| 257 | =over 4 |
| 258 | |
| 259 | =item Long-running opcodes |
| 260 | |
| 261 | As the Perl interpreter looks at signal flags only when it is about |
| 262 | to execute a new opcode, a signal that arrives during a long-running |
| 263 | opcode (e.g. a regular expression operation on a very large string) will |
| 264 | not be seen until the current opcode completes. |
| 265 | |
| 266 | If a signal of any given type fires multiple times during an opcode |
| 267 | (such as from a fine-grained timer), the handler for that signal will |
| 268 | be called only once, after the opcode completes; all other |
| 269 | instances will be discarded. Furthermore, if your system's signal queue |
| 270 | gets flooded to the point that there are signals that have been raised |
| 271 | but not yet caught (and thus not deferred) at the time an opcode |
| 272 | completes, those signals may well be caught and deferred during |
| 273 | subsequent opcodes, with sometimes surprising results. For example, you |
| 274 | may see alarms delivered even after calling C<alarm(0)> as the latter |
| 275 | stops the raising of alarms but does not cancel the delivery of alarms |
| 276 | raised but not yet caught. Do not depend on the behaviors described in |
| 277 | this paragraph as they are side effects of the current implementation and |
| 278 | may change in future versions of Perl. |
| 279 | |
| 280 | =item Interrupting IO |
| 281 | |
| 282 | When a signal is delivered (e.g., SIGINT from a control-C) the operating |
| 283 | system breaks into IO operations like I<read>(2), which is used to |
| 284 | implement Perl's readline() function, the C<< <> >> operator. On older |
| 285 | Perls the handler was called immediately (and as C<read> is not "unsafe", |
| 286 | this worked well). With the "deferred" scheme the handler is I<not> called |
| 287 | immediately, and if Perl is using the system's C<stdio> library that |
| 288 | library may restart the C<read> without returning to Perl to give it a |
| 289 | chance to call the %SIG handler. If this happens on your system the |
| 290 | solution is to use the C<:perlio> layer to do IO--at least on those handles |
| 291 | that you want to be able to break into with signals. (The C<:perlio> layer |
| 292 | checks the signal flags and calls %SIG handlers before resuming IO |
| 293 | operation.) |
| 294 | |
| 295 | The default in Perl 5.8.0 and later is to automatically use |
| 296 | the C<:perlio> layer. |
| 297 | |
| 298 | Note that it is not advisable to access a file handle within a signal |
| 299 | handler where that signal has interrupted an I/O operation on that same |
| 300 | handle. While Perl will at least try hard not to crash, there are no |
| 301 | guarantees of data integrity; for example, some data might get dropped or |
| 302 | written twice. |
| 303 | |
| 304 | Some networking library functions like gethostbyname() are known to have |
| 305 | their own implementations of timeouts which may conflict with your |
| 306 | timeouts. If you have problems with such functions, try using the POSIX |
| 307 | sigaction() function, which bypasses Perl safe signals. Be warned that |
| 308 | this does subject you to possible memory corruption, as described above. |
| 309 | |
| 310 | Instead of setting C<$SIG{ALRM}>: |
| 311 | |
| 312 | local $SIG{ALRM} = sub { die "alarm" }; |
| 313 | |
| 314 | try something like the following: |
| 315 | |
| 316 | use POSIX qw(SIGALRM); |
| 317 | POSIX::sigaction(SIGALRM, POSIX::SigAction->new(sub { die "alarm" })) |
| 318 | || die "Error setting SIGALRM handler: $!\n"; |
| 319 | |
| 320 | Another way to disable the safe signal behavior locally is to use |
| 321 | the C<Perl::Unsafe::Signals> module from CPAN, which affects |
| 322 | all signals. |
| 323 | |
| 324 | =item Restartable system calls |
| 325 | |
| 326 | On systems that supported it, older versions of Perl used the |
| 327 | SA_RESTART flag when installing %SIG handlers. This meant that |
| 328 | restartable system calls would continue rather than returning when |
| 329 | a signal arrived. In order to deliver deferred signals promptly, |
| 330 | Perl 5.8.0 and later do I<not> use SA_RESTART. Consequently, |
| 331 | restartable system calls can fail (with $! set to C<EINTR>) in places |
| 332 | where they previously would have succeeded. |
| 333 | |
| 334 | The default C<:perlio> layer retries C<read>, C<write> |
| 335 | and C<close> as described above; interrupted C<wait> and |
| 336 | C<waitpid> calls will always be retried. |
| 337 | |
| 338 | =item Signals as "faults" |
| 339 | |
| 340 | Certain signals like SEGV, ILL, and BUS are generated by virtual memory |
| 341 | addressing errors and similar "faults". These are normally fatal: there is |
| 342 | little a Perl-level handler can do with them. So Perl delivers them |
| 343 | immediately rather than attempting to defer them. |
| 344 | |
| 345 | =item Signals triggered by operating system state |
| 346 | |
| 347 | On some operating systems certain signal handlers are supposed to "do |
| 348 | something" before returning. One example can be CHLD or CLD, which |
| 349 | indicates a child process has completed. On some operating systems the |
| 350 | signal handler is expected to C<wait> for the completed child |
| 351 | process. On such systems the deferred signal scheme will not work for |
| 352 | those signals: it does not do the C<wait>. The failure will |
| 353 | look like a loop as the operating system will reissue the signal because |
| 354 | there are completed child processes that have not yet been C<wait>ed for. |
| 355 | |
| 356 | =back |
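
As noted under "Restartable system calls" above, calls that bypass the C<:perlio> layer can fail with C<EINTR>. The following sketch retries sysread() in that case. The helper name C<sysread_retry> is an invention for this example, and the forked child exists only to make the read block long enough for an alarm to interrupt it.

```perl
use strict;
use warnings;
use Errno qw(EINTR);

# Hypothetical helper: sysread() that retries when a signal interrupts it.
sub sysread_retry {
    my ($fh, $len) = @_;
    my $buf = "";
    while (1) {
        my $n = sysread($fh, $buf, $len);
        return $buf if defined $n;        # data, or "" at end of file
        next if $! == EINTR;              # interrupted by a signal: try again
        die "sysread failed: $!";
    }
}

pipe(my $rd, my $wr) || die "pipe failed: $!";
my $pid = fork();
die "cannot fork: $!" unless defined $pid;
if ($pid == 0) {                          # child: write after the alarm fired
    close $rd;
    sleep 2;
    syswrite($wr, "late data\n");
    exit 0;
}
close $wr;

my $alarms = 0;
$SIG{ALRM} = sub { $alarms++ };
alarm 1;                                  # interrupts the blocked read
my $data = sysread_retry($rd, 1024);
waitpid($pid, 0);
print "alarms seen: $alarms, data: $data";
```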
| 357 | |
| 358 | If you want the old signal behavior back despite possible |
| 359 | memory corruption, set the environment variable C<PERL_SIGNALS> to |
| 360 | C<"unsafe">. This feature first appeared in Perl 5.8.1. |
| 361 | |
| 362 | =head1 Named Pipes |
| 363 | |
| 364 | A named pipe (often referred to as a FIFO) is an old Unix IPC |
| 365 | mechanism for processes communicating on the same machine. It works |
| 366 | just like regular anonymous pipes, except that the |
| 367 | processes rendezvous using a filename and need not be related. |
| 368 | |
| 369 | To create a named pipe, use the C<POSIX::mkfifo()> function. |
| 370 | |
| 371 | use POSIX qw(mkfifo); |
| 372 | mkfifo($path, 0700) || die "mkfifo $path failed: $!"; |
| 373 | |
| 374 | You can also use the Unix command mknod(1), or on some |
| 375 | systems, mkfifo(1). These may not be in your normal path, though. |
| 376 | |
| 377 | # system return val is backwards, so && not || |
| 378 | # |
| 379 | $ENV{PATH} .= ":/etc:/usr/etc"; |
| 380 | if ( system("mknod", $path, "p") |
| 381 | && system("mkfifo", $path) ) |
| 382 | { |
| 383 | die "mk{nod,fifo} $path failed"; |
| 384 | } |
| 385 | |
| 386 | |
| 387 | A fifo is convenient when you want to connect a process to an unrelated |
| 388 | one. When you open a fifo, the program will block until there's something |
| 389 | on the other end. |
| 390 | |
| 391 | For example, let's say you'd like to have your F<.signature> file be a |
| 392 | named pipe that has a Perl program on the other end. Now every time any |
| 393 | program (like a mailer, news reader, finger program, etc.) tries to read |
| 394 | from that file, the reading program will read the new signature from your |
| 395 | program. We'll use the pipe-checking file-test operator, B<-p>, to find |
| 396 | out whether anyone (or anything) has accidentally removed our fifo. |
| 397 | |
| 398 | chdir(); # go home |
| 399 | my $FIFO = ".signature"; |
| 400 | |
| 401 | while (1) { |
| 402 | unless (-p $FIFO) { |
| 403 | unlink $FIFO; # discard any failure, will catch later |
| 404 | require POSIX; # delayed loading of heavy module |
| 405 | POSIX::mkfifo($FIFO, 0700) |
| 406 | || die "can't mkfifo $FIFO: $!"; |
| 407 | } |
| 408 | |
| 409 | # next line blocks till there's a reader |
| 410 | open (FIFO, "> $FIFO") || die "can't open $FIFO: $!"; |
| 411 | print FIFO "John Smith (smith\@host.org)\n", `fortune -s`; |
| 412 | close(FIFO) || die "can't close $FIFO: $!"; |
| 413 | sleep 2; # to avoid dup signals |
| 414 | } |
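
To watch the rendezvous from both sides, here is a self-contained sketch that plays writer and reader in one program. The temporary directory and the name F<demo.fifo> are assumptions of this example, not part of the signature server above.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $fifo = "$dir/demo.fifo";              # hypothetical path for this sketch
mkfifo($fifo, 0700) || die "mkfifo $fifo failed: $!";

my $pid = fork();
die "cannot fork: $!" unless defined $pid;
if ($pid == 0) {                          # child plays the writer
    # this open blocks until the parent opens the fifo for reading
    open(my $out, ">", $fifo) || die "can't write $fifo: $!";
    print $out "hello from $$\n";
    close($out) || die "can't close $fifo: $!";
    POSIX::_exit(0);                      # skip the parent's cleanup handlers
}

# this open blocks until the child opens the fifo for writing
open(my $in, "<", $fifo) || die "can't read $fifo: $!";
my $line = <$in>;
close($in) || die "can't close $fifo: $!";
waitpid($pid, 0);
print "read: $line";
```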
| 415 | |
| 416 | =head1 Using open() for IPC |
| 417 | |
| 418 | Perl's basic open() statement can also be used for unidirectional |
| 419 | interprocess communication by either appending or prepending a pipe |
| 420 | symbol to the second argument to open(). Here's how to start |
| 421 | something up in a child process you intend to write to: |
| 422 | |
| 423 | open(SPOOLER, "| cat -v | lpr -h 2>/dev/null") |
| 424 | || die "can't fork: $!"; |
| 425 | local $SIG{PIPE} = sub { die "spooler pipe broke" }; |
| 426 | print SPOOLER "stuff\n"; |
| 427 | close(SPOOLER) || die "bad spool: $! $?"; |
| 428 | |
| 429 | And here's how to start up a child process you intend to read from: |
| 430 | |
| 431 | open(STATUS, "netstat -an 2>&1 |") |
| 432 | || die "can't fork: $!"; |
| 433 | while (<STATUS>) { |
| 434 | next if /^(tcp|udp)/; |
| 435 | print; |
| 436 | } |
| 437 | close(STATUS) || die "bad netstat: $! $?"; |
| 438 | |
| 439 | If one can be sure that a particular program is a Perl script expecting |
| 440 | filenames in @ARGV, the clever programmer can write something like this: |
| 441 | |
| 442 | % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile |
| 443 | |
| 444 | and no matter which sort of shell it's called from, the Perl program will |
| 445 | read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile> |
| 446 | in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3> |
| 447 | file. Pretty nifty, eh? |
| 448 | |
| 449 | You might notice that you could use backticks for much the |
| 450 | same effect as opening a pipe for reading: |
| 451 | |
| 452 | print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`; |
| 453 | die "bad netstat ($?)" if $?; |
| 454 | |
| 455 | While this is true on the surface, it's much more efficient to process the |
| 456 | file one line or record at a time because then you don't have to read the |
| 457 | whole thing into memory at once. It also gives you finer control of the |
| 458 | whole process, letting you kill off the child process early if you'd like. |
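
Here is a sketch of that finer control. The child is a Perl one-liner run through C<$^X> so the example does not depend on any particular external command; in real code it would be something like netstat.

```perl
use strict;
use warnings;

# $^X is the running perl; the one-liner stands in for a chatty command.
my @cmd = ($^X, "-e", 'print "line $_\n" for 1 .. 1_000_000');

my $pid = open(my $kid, "-|", @cmd) || die "can't fork: $!";

my @wanted;
while (my $line = <$kid>) {
    push @wanted, $line;
    last if @wanted == 3;        # we have seen enough
}

kill TERM => $pid;               # stop the child instead of draining it
close($kid);                     # reaps the child; false here, as it was killed
my $status = $?;
print "kept ", scalar(@wanted), " lines, child status $status\n";
```

Only three lines are ever held in memory, whereas backticks would have collected the whole output before returning.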
| 459 | |
| 460 | Be careful to check the return values from both open() and close(). If |
| 461 | you're I<writing> to a pipe, you should also trap SIGPIPE. Otherwise, |
| 462 | think of what happens when you start up a pipe to a command that doesn't |
| 463 | exist: the open() will in all likelihood succeed (it only reflects the |
| 464 | fork()'s success), but then your output will fail--spectacularly. Perl |
| 465 | can't know whether the command worked, because your command is actually |
| 466 | running in a separate process whose exec() might have failed. Therefore, |
| 467 | while readers of bogus commands return just a quick EOF, writers |
| 468 | to bogus commands will get hit with a signal, which they'd best be prepared |
| 469 | to handle. Consider: |
| 470 | |
| 471 | open(FH, "|bogus") || die "can't fork: $!"; |
| 472 | print FH "bang\n"; # neither necessary nor sufficient |
| 473 | # to check print retval! |
| 474 | close(FH) || die "can't close: $!"; |
| 475 | |
| 476 | The return value from print() is not worth checking here because of |
| 477 | pipe buffering: physical writes are delayed. The failure won't surface until the |
| 478 | close, and it will surface as a SIGPIPE. To survive it, ignore the signal |
| 479 | and check the status that close() reports: |
| 480 | |
| 481 | $SIG{PIPE} = "IGNORE"; |
| 482 | open(FH, "|bogus") || die "can't fork: $!"; |
| 483 | print FH "bang\n"; |
| 484 | close(FH) || die "can't close: status=$?"; |
| 485 | |
| 486 | =head2 Filehandles |
| 487 | |
| 488 | Both the main process and any child processes it forks share the same |
| 489 | STDIN, STDOUT, and STDERR filehandles. If both processes try to access |
| 490 | them at once, strange things can happen. You may also want to close |
| 491 | or reopen the filehandles for the child. You can get around this by |
| 492 | opening your pipe with open(), but on some systems this means that the |
| 493 | child process cannot outlive the parent. |
| 494 | |
| 495 | =head2 Background Processes |
| 496 | |
| 497 | You can run a command in the background with: |
| 498 | |
| 499 | system("cmd &"); |
| 500 | |
| 501 | The command's STDOUT and STDERR (and possibly STDIN, depending on your |
| 502 | shell) will be the same as the parent's. You won't need to catch |
| 503 | SIGCHLD because of the double-fork taking place; see below for details. |
| 504 | |
| 505 | =head2 Complete Dissociation of Child from Parent |
| 506 | |
| 507 | In some cases (starting server processes, for instance) you'll want to |
| 508 | completely dissociate the child process from the parent. This is |
| 509 | often called daemonization. A well-behaved daemon will also chdir() |
| 510 | to the root directory so it doesn't prevent unmounting the filesystem |
| 511 | containing the directory from which it was launched, and redirect its |
| 512 | standard file descriptors from and to F</dev/null> so that random |
| 513 | output doesn't wind up on the user's terminal. |
| 514 | |
| 515 | use POSIX "setsid"; |
| 516 | |
| 517 | sub daemonize { |
| 518 | chdir("/") || die "can't chdir to /: $!"; |
| 519 | open(STDIN, "< /dev/null") || die "can't read /dev/null: $!"; |
| 520 | open(STDOUT, "> /dev/null") || die "can't write to /dev/null: $!"; |
| 521 | defined(my $pid = fork()) || die "can't fork: $!"; |
| 522 | exit if $pid; # non-zero now means I am the parent |
| 523 | (setsid() != -1) || die "Can't start a new session: $!"; |
| 524 | open(STDERR, ">&STDOUT") || die "can't dup stdout: $!"; |
| 525 | } |
| 526 | |
| 527 | The fork() has to come before the setsid() to ensure you aren't a |
| 528 | process group leader; the setsid() will fail if you are. If your |
| 529 | system doesn't have the setsid() function, open F</dev/tty> and use the |
| 530 | C<TIOCNOTTY> ioctl() on it instead. See tty(4) for details. |
| 531 | |
| 532 | Non-Unix users should check their C<< I<Your_OS>::Process >> module for |
| 533 | other possible solutions. |
| 534 | |
| 535 | =head2 Safe Pipe Opens |
| 536 | |
| 537 | Another interesting approach to IPC is making your single program go |
| 538 | multiprocess and communicate between--or even amongst--yourselves. The |
| 539 | open() function will accept a file argument of either C<"-|"> or C<"|-"> |
| 540 | to do a very interesting thing: it forks a child connected to the |
| 541 | filehandle you've opened. The child is running the same program as the |
| 542 | parent. This is useful for safely opening a file when running under an |
| 543 | assumed UID or GID, for example. If you open a pipe I<to> minus, you can |
| 544 | write to the filehandle you opened and your kid will find it in I<his> |
| 545 | STDIN. If you open a pipe I<from> minus, you can read from the filehandle |
| 546 | you opened whatever your kid writes to I<his> STDOUT. |
| 547 | |
| 548 | use English; |
| 549 | my $PRECIOUS = "/path/to/some/safe/file"; |
| 550 | my $sleep_count; |
| 551 | my $pid; |
| 552 | |
| 553 | do { |
| 554 | $pid = open(KID_TO_WRITE, "|-"); |
| 555 | unless (defined $pid) { |
| 556 | warn "cannot fork: $!"; |
| 557 | die "bailing out" if $sleep_count++ > 6; |
| 558 | sleep 10; |
| 559 | } |
| 560 | } until defined $pid; |
| 561 | |
| 562 | if ($pid) { # I am the parent |
| 563 | print KID_TO_WRITE @some_data; |
| 564 | close(KID_TO_WRITE) || warn "kid exited $?"; |
| 565 | } else { # I am the child |
| 566 | # drop permissions in setuid and/or setgid programs: |
| 567 | ($EUID, $EGID) = ($UID, $GID); |
| 568 | open (OUTFILE, "> $PRECIOUS") |
| 569 | || die "can't open $PRECIOUS: $!"; |
| 570 | while (<STDIN>) { |
| 571 | print OUTFILE; # child's STDIN is parent's KID_TO_WRITE |
| 572 | } |
| 573 | close(OUTFILE) || die "can't close $PRECIOUS: $!"; |
| 574 | exit(0); # don't forget this!! |
| 575 | } |
| 576 | |
| 577 | Another common use for this construct is when you need to execute |
| 578 | something without the shell's interference. With system(), it's |
| 579 | straightforward, but you can't use a pipe open or backticks safely. |
| 580 | That's because there's no way to stop the shell from getting its hands on |
| 581 | your arguments. Instead, use lower-level control to call exec() directly. |
| 582 | |
| 583 | Here's a safe backtick or pipe open for read: |
| 584 | |
| 585 | my $pid = open(KID_TO_READ, "-|"); |
| 586 | defined($pid) || die "can't fork: $!"; |
| 587 | |
| 588 | if ($pid) { # parent |
| 589 | while (<KID_TO_READ>) { |
| 590 | # do something interesting |
| 591 | } |
| 592 | close(KID_TO_READ) || warn "kid exited $?"; |
| 593 | |
| 594 | } else { # child |
| 595 | ($EUID, $EGID) = ($UID, $GID); # suid only |
| 596 | exec($program, @options, @args) |
| 597 | || die "can't exec program: $!"; |
| 598 | # NOTREACHED |
| 599 | } |
| 600 | |
| 601 | And here's a safe pipe open for writing: |
| 602 | |
| 603 | my $pid = open(KID_TO_WRITE, "|-"); |
| 604 | defined($pid) || die "can't fork: $!"; |
| 605 | |
| 606 | $SIG{PIPE} = sub { die "whoops, $program pipe broke" }; |
| 607 | |
| 608 | if ($pid) { # parent |
| 609 | print KID_TO_WRITE @data; |
| 610 | close(KID_TO_WRITE) || warn "kid exited $?"; |
| 611 | |
| 612 | } else { # child |
| 613 | ($EUID, $EGID) = ($UID, $GID); |
| 614 | exec($program, @options, @args) |
| 615 | || die "can't exec program: $!"; |
| 616 | # NOTREACHED |
| 617 | } |
| 618 | |
| 619 | It is very easy to dead-lock a process using this form of open(), or |
| 620 | indeed with any use of pipe() with multiple subprocesses. The |
| 621 | example above is "safe" because it is simple and calls exec(). See |
| 622 | L</"Avoiding Pipe Deadlocks"> for general safety principles, but there |
| 623 | are extra gotchas with Safe Pipe Opens. |
| 624 | |
| 625 | In particular, if you opened the pipe using C<open FH, "|-">, then you |
| 626 | cannot simply use close() in the parent process to close an unwanted |
| 627 | writer. Consider this code: |
| 628 | |
| 629 | my $pid = open(WRITER, "|-"); # fork open a kid |
| 630 | defined($pid) || die "first fork failed: $!"; |
| 631 | if ($pid) { |
| 632 | defined(my $sub_pid = fork()) || die "second fork failed: $!"; |
| 633 | if ($sub_pid) { |
| 634 | close(WRITER) || die "couldn't close WRITER: $!"; |
| 635 | # now do something else... |
| 636 | } |
| 637 | else { |
| 638 | # first write to WRITER |
| 639 | # ... |
| 640 | # then when finished |
| 641 | close(WRITER) || die "couldn't close WRITER: $!"; |
| 642 | exit(0); |
| 643 | } |
| 644 | } |
| 645 | else { |
| 646 | # first do something with STDIN, then |
| 647 | exit(0); |
| 648 | } |
| 649 | |
| 650 | In the example above, the true parent does not want to write to the WRITER |
| 651 | filehandle, so it closes it. However, because WRITER was opened using |
| 652 | C<open FH, "|-">, it has a special behavior: closing it calls |
| 653 | waitpid() (see L<perlfunc/waitpid>), which waits for the subprocess |
| 654 | to exit. If the child process ends up waiting for something happening |
| 655 | in the section marked "do something else", you have deadlock. |
| 656 | |
| 657 | This can also be a problem with intermediate subprocesses in more |
| 658 | complicated code, which will call waitpid() on all open filehandles |
| 659 | during global destruction--in no predictable order. |
| 660 | |
| 661 | To solve this, you must manually use pipe(), fork(), and the form of |
| 662 | open() which sets one file descriptor to another, as shown below: |
| 663 | |
| 664 | pipe(READER, WRITER) || die "pipe failed: $!"; |
| 665 | $pid = fork(); |
| 666 | defined($pid) || die "first fork failed: $!"; |
| 667 | if ($pid) { |
| 668 | close READER; |
| 669 | defined(my $sub_pid = fork()) || die "second fork failed: $!"; |
| 670 | if ($sub_pid) { |
| 671 | close(WRITER) || die "can't close WRITER: $!"; |
| 672 | } |
| 673 | else { |
| 674 | # write to WRITER... |
| 675 | # ... |
| 676 | # then when finished |
| 677 | close(WRITER) || die "can't close WRITER: $!"; |
| 678 | exit(0); |
| 679 | } |
| 680 | # now do something else... |
| 681 | } |
| 682 | else { |
| 683 | open(STDIN, "<&READER") || die "can't reopen STDIN: $!"; |
| 684 | close(WRITER) || die "can't close WRITER: $!"; |
| 685 | # do something... |
| 686 | exit(0); |
| 687 | } |
| 688 | |
| 689 | Since Perl 5.8.0, you can also use the list form of C<open> for pipes. |
| 690 | This is preferred when you wish to avoid having the shell interpret |
| 691 | metacharacters that may be in your command string. |
| 692 | |
| 693 | So for example, instead of using: |
| 694 | |
| 695 | open(PS_PIPE, "ps aux|") || die "can't open ps pipe: $!"; |
| 696 | |
| 697 | One would use either of these: |
| 698 | |
| 699 | open(PS_PIPE, "-|", "ps", "aux") |
| 700 | || die "can't open ps pipe: $!"; |
| 701 | |
| 702 | @ps_args = qw[ ps aux ]; |
| 703 | open(PS_PIPE, "-|", @ps_args) |
| 704 | || die "can't open @ps_args|: $!"; |
| 705 | |
| 706 | Because there are more than three arguments to open(), it forks the ps(1) |
| 707 | command I<without> spawning a shell, and reads its standard output via the |
| 708 | C<PS_PIPE> filehandle. The corresponding syntax to I<write> to command |
| 709 | pipes is to use C<"|-"> in place of C<"-|">. |
| 710 | |
| 711 | This was admittedly a rather silly example, because you're using string |
| 712 | literals whose content is perfectly safe. There is therefore no cause to |
| 713 | resort to the harder-to-read, multi-argument form of pipe open(). However, |
| 714 | whenever you cannot be assured that the program arguments are free of shell |
| 715 | metacharacters, the fancier form of open() should be used. For example: |
| 716 | |
| 717 | @grep_args = ("egrep", "-i", $some_pattern, @many_files); |
| 718 | open(GREP_PIPE, "-|", @grep_args) |
| 719 | || die "can't open @grep_args|: $!"; |
| 720 | |
| 721 | Here the multi-argument form of pipe open() is preferred because the |
| 722 | pattern and indeed even the filenames themselves might hold metacharacters. |
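
As a minimal sketch of that difference, the following hands a string full of
shell metacharacters to a child through the list form. It assumes a platform
where list-form pipe opens are available, and it uses C<$^X> (the perl that is
currently running) as a stand-in command so that it depends on no particular
external program. The name C<echo_via_list_open> is purely illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper: run a child perl that prints its first argument
# back, handing the argument over via list-form pipe open (no shell).
sub echo_via_list_open {
    my ($arg) = @_;
    # "--" ends perl's own switch processing, so $arg is never taken
    # for a command-line option even if it starts with a dash.
    open(my $fh, "-|", $^X, "-e", 'print $ARGV[0]', "--", $arg)
        || die "can't open pipe from $^X: $!";
    my $out = do { local $/; <$fh> };     # slurp everything the child wrote
    close($fh) || die "child perl failed: status $?";
    return $out;
}

# Every metacharacter arrives untouched because no shell ever parses it.
my $tricky = 'a b; echo $HOME | wc `date` *';
print echo_via_list_open($tricky), "\n";
```

Had the same string been interpolated into a one-argument command string, the
shell would have split it at the semicolon and the pipe symbol and expanded
the rest.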
| 723 | |
| 724 | Be aware that these operations are full Unix forks, which means they may |
| 725 | not be correctly implemented on all alien systems. |
| 726 | |
| 727 | =head2 Avoiding Pipe Deadlocks |
| 728 | |
| 729 | Whenever you have more than one subprocess, you must be careful that each |
| 730 | closes whichever half of any pipes created for interprocess communication |
| 731 | it is not using. Otherwise, any child process reading from the pipe |
| 732 | and expecting an EOF will never receive it, and therefore never exit. A |
| 733 | single process closing a pipe is not enough to close it; the reader sees |
| 734 | EOF only once the last process holding the write end has closed it. |
| 735 | |
| 736 | Certain built-in Unix features help prevent this most of the time. For |
| 737 | instance, filehandles have a "close on exec" flag, which is set I<en masse> |
| 738 | under control of the C<$^F> variable. This is so any filehandles you |
| 739 | didn't explicitly route to the STDIN, STDOUT or STDERR of a child |
| 740 | I<program> will be automatically closed. |
| 741 | |
| 742 | Always explicitly and immediately call close() on the writable end of any |
| 743 | pipe, unless that process is actually writing to it. Even if you don't |
| 744 | explicitly call close(), Perl will still close() all filehandles during |
| 745 | global destruction. As previously discussed, if those filehandles have |
| 746 | been opened with Safe Pipe Open, this will result in calling waitpid(), |
| 747 | which may again deadlock. |
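
Here is a small sketch of that discipline, assuming a Unix-like fork(). The
helper name C<count_lines_in_child> and the convention of returning the line
count through the exit status are inventions for this example only.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: feed @lines to a forked child over a pipe and
# return how many lines the child counted before it saw EOF.
sub count_lines_in_child {
    my @lines = @_;
    pipe(my $reader, my $writer) || die "pipe failed: $!";
    my $pid = fork();
    defined($pid) || die "fork failed: $!";
    if ($pid == 0) {
        # Child: it only reads, so it drops its copy of the write end
        # at once.  Without this close the loop would never see EOF.
        close($writer) || die "child can't close writer: $!";
        my $count = 0;
        while (defined(my $line = <$reader>)) { $count++ }
        # _exit skips END blocks and buffers inherited from the parent.
        POSIX::_exit($count);
    }
    # Parent: it only writes, so it drops its copy of the read end.
    close($reader) || die "parent can't close reader: $!";
    print {$writer} "$_\n" for @lines;
    close($writer) || die "parent can't close writer: $!";  # child gets EOF
    waitpid($pid, 0);
    return $? >> 8;
}

print count_lines_in_child(qw(alpha beta gamma)), "\n";   # prints 3
```

If the child kept its copy of the write end open, its read loop would wait on
itself forever and the parent's waitpid() would never return.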
| 748 | |
| 749 | =head2 Bidirectional Communication with Another Process |
| 750 | |
| 751 | While this works reasonably well for unidirectional communication, what |
| 752 | about bidirectional communication? The most obvious approach doesn't work: |
| 753 | |
| 754 | # THIS DOES NOT WORK!! |
| 755 | open(PROG_FOR_READING_AND_WRITING, "| some program |") |
| 756 | |
| 757 | If you forget to C<use warnings>, you'll miss out entirely on the |
| 758 | helpful diagnostic message: |
| 759 | |
| 760 | Can't do bidirectional pipe at -e line 1. |
| 761 | |
| 762 | If you really want to, you can use the standard open2() from the |
| 763 | C<IPC::Open2> module to catch both ends. There's also an open3() in |
| 764 | C<IPC::Open3> for tridirectional I/O so you can also catch your child's |
| 765 | STDERR, but doing so would then require an awkward select() loop and |
| 766 | wouldn't allow you to use normal Perl input operations. |
| 767 | |
| 768 | If you look at its source, you'll see that open2() uses low-level |
| 769 | primitives like the pipe() and exec() syscalls to create all the |
| 770 | connections. Although it might have been more efficient by using |
| 771 | socketpair(), this would have been even less portable than it already |
| 772 | is. The open2() and open3() functions are unlikely to work anywhere |
| 773 | except on a Unix system, or at least one purporting POSIX compliance. |
| 774 | |
| 775 | =for TODO |
| 776 | Hold on, is this even true? First it says that socketpair() is avoided |
| 777 | for portability, but then it says it probably won't work except on |
| 778 | Unixy systems anyway. Which one of those is true? |
| 779 | |
| 780 | Here's an example of using open2(): |
| 781 | |
| 782 | use FileHandle; |
| 783 | use IPC::Open2; |
| 784 | $pid = open2(*Reader, *Writer, "cat -un"); |
| 785 | print Writer "stuff\n"; |
| 786 | $got = <Reader>; |
| 787 | |
| 788 | The problem with this is that buffering is really going to ruin your |
| 789 | day. Even though your C<Writer> filehandle is auto-flushed so the process |
| 790 | on the other end gets your data in a timely manner, you can't usually do |
| 791 | anything to force that process to give its data to you in a similarly quick |
| 792 | fashion. In this special case, we could actually do so, because we gave |
| 793 | I<cat> a B<-u> flag to make it unbuffered. But very few commands are |
| 794 | designed to operate over pipes, so this seldom works unless you yourself |
| 795 | wrote the program on the other end of the double-ended pipe. |
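
To illustrate the case where you did write the program on the other end, here
is a sketch in which the child is a one-line perl script that turns on its own
autoflush. It assumes C<$^X> points at a working perl; the helper name
C<upcase_via_child> is hypothetical.

```perl
use strict;
use warnings;
use IPC::Open2;

# Hypothetical helper: send each word to a child perl that upper-cases
# its input, reading one reply per request in strict lock step.
sub upcase_via_child {
    my @words = @_;
    # The child sets $| so each reply is flushed as soon as it is printed;
    # without that, the first read below would block forever.
    my $pid = open2(my $from_child, my $to_child,
                    $^X, "-e", '$| = 1; print uc($_) while <STDIN>;');
    my @replies;
    for my $word (@words) {
        print {$to_child} "$word\n";      # open2 autoflushes this handle
        my $reply = <$from_child>;
        defined($reply) || die "child closed its output early";
        chomp($reply);
        push @replies, $reply;
    }
    close($to_child) || die "can't close pipe to child: $!";  # child sees EOF
    close($from_child);
    waitpid($pid, 0);                     # open2 does not reap for you
    return @replies;
}

print join(" ", upcase_via_child("hello", "world")), "\n";  # prints "HELLO WORLD"
```

Remove the C<$| = 1> from the child and the conversation stalls on the first
read, which is exactly the buffering problem described above.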
| 796 | |
| 797 | A solution to this is to use a library which uses pseudottys to make your |
| 798 | program behave more reasonably. This way you don't have to have control |
| 799 | over the source code of the program you're using. The C<Expect> module |
| 800 | from CPAN also addresses this kind of thing. This module requires two |
| 801 | other modules from CPAN, C<IO::Pty> and C<IO::Stty>. It sets up a pseudo |
| 802 | terminal to interact with programs that insist on talking to the terminal |
| 803 | device driver. If your system is supported, this may be your best bet. |
| 804 | |
| 805 | =head2 Bidirectional Communication with Yourself |
| 806 | |
| 807 | If you want, you may make low-level pipe() and fork() syscalls to stitch |
| 808 | this together by hand. This example only talks to itself, but you could |
| 809 | reopen the appropriate handles to STDIN and STDOUT and call other processes. |
| 810 | (The following example lacks proper error checking.) |
| 811 | |
| 812 | #!/usr/bin/perl -w |
| 813 | # pipe1 - bidirectional communication using two pipe pairs |
| 814 | # designed for the socketpair-challenged |
| 815 | use IO::Handle; # thousands of lines just for autoflush :-( |
| 816 | pipe(PARENT_RDR, CHILD_WTR); # XXX: check failure? |
| 817 | pipe(CHILD_RDR, PARENT_WTR); # XXX: check failure? |
| 818 | CHILD_WTR->autoflush(1); |
| 819 | PARENT_WTR->autoflush(1); |
| 820 | |
| 821 | if ($pid = fork()) { |
| 822 | close PARENT_RDR; |
| 823 | close PARENT_WTR; |
| 824 | print CHILD_WTR "Parent Pid $$ is sending this\n"; |
| 825 | chomp($line = <CHILD_RDR>); |
| 826 | print "Parent Pid $$ just read this: '$line'\n"; |
| 827 | close CHILD_RDR; close CHILD_WTR; |
| 828 | waitpid($pid, 0); |
| 829 | } else { |
| 830 | die "cannot fork: $!" unless defined $pid; |
| 831 | close CHILD_RDR; |
| 832 | close CHILD_WTR; |
| 833 | chomp($line = <PARENT_RDR>); |
| 834 | print "Child Pid $$ just read this: '$line'\n"; |
| 835 | print PARENT_WTR "Child Pid $$ is sending this\n"; |
| 836 | close PARENT_RDR; |
| 837 | close PARENT_WTR; |
| 838 | exit(0); |
| 839 | } |
| 840 | |
| 841 | But you don't actually have to make two pipe calls. If you |
| 842 | have the socketpair() system call, it will do this all for you. |
| 843 | |
| 844 | #!/usr/bin/perl -w |
| 845 | # pipe2 - bidirectional communication using socketpair |
| 846 | # "the best ones always go both ways" |
| 847 | |
| 848 | use Socket; |
| 849 | use IO::Handle; # thousands of lines just for autoflush :-( |
| 850 | |
| 851 | # We say AF_UNIX because although *_LOCAL is the |
| 852 | # POSIX 1003.1g form of the constant, many machines |
| 853 | # still don't have it. |
| 854 | socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC) |
| 855 | || die "socketpair: $!"; |
| 856 | |
| 857 | CHILD->autoflush(1); |
| 858 | PARENT->autoflush(1); |
| 859 | |
| 860 | if ($pid = fork()) { |
| 861 | close PARENT; |
| 862 | print CHILD "Parent Pid $$ is sending this\n"; |
| 863 | chomp($line = <CHILD>); |
| 864 | print "Parent Pid $$ just read this: '$line'\n"; |
| 865 | close CHILD; |
| 866 | waitpid($pid, 0); |
| 867 | } else { |
| 868 | die "cannot fork: $!" unless defined $pid; |
| 869 | close CHILD; |
| 870 | chomp($line = <PARENT>); |
| 871 | print "Child Pid $$ just read this: '$line'\n"; |
| 872 | print PARENT "Child Pid $$ is sending this\n"; |
| 873 | close PARENT; |
| 874 | exit(0); |
| 875 | } |
| 876 | |
| 877 | =head1 Sockets: Client/Server Communication |
| 878 | |
| 879 | While not entirely limited to Unix-derived operating systems (e.g., WinSock |
| 880 | on PCs provides socket support, as do some VMS libraries), you might not have |
| 881 | sockets on your system, in which case this section probably isn't going to |
| 882 | do you much good. With sockets, you can do both virtual circuits like TCP |
| 883 | streams and datagrams like UDP packets. You may be able to do even more |
| 884 | depending on your system. |
| 885 | |
| 886 | The Perl functions for dealing with sockets have the same names as |
| 887 | the corresponding system calls in C, but their arguments tend to differ |
| 888 | for two reasons. First, Perl filehandles work differently than C file |
| 889 | descriptors. Second, Perl already knows the length of its strings, so you |
| 890 | don't need to pass that information. |
| 891 | |
| 892 | One of the major problems with ancient, antemillennial socket code in Perl |
| 893 | was that it used hard-coded values for some of the constants, which |
| 894 | severely hurt portability. If you ever see code that does anything like |
| 895 | explicitly setting C<$AF_INET = 2>, you know you're in for big trouble. |
| 896 | An immeasurably superior approach is to use the C<Socket> module, which more |
| 897 | reliably grants access to the various constants and functions you'll need. |
| 898 | |
| 899 | If you're not writing a server/client for an existing protocol like |
| 900 | NNTP or SMTP, you should give some thought to how your server will |
| 901 | know when the client has finished talking, and vice-versa. Most |
| 902 | protocols are based on one-line messages and responses (so one party |
| 903 | knows the other has finished when a "\n" is received) or multi-line |
| 904 | messages and responses that end with a period on an empty line |
| 905 | ("\n.\n" terminates a message/response). |
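
A reader for the second convention might look something like the sketch
below. It assumes bare "\n" line endings and does no escaping of body lines
that happen to consist of a single period, which a real protocol would have
to address. The name C<read_dot_terminated> is made up for this example, and
an in-memory filehandle stands in for a socket.

```perl
use strict;
use warnings;

# Hypothetical helper: collect lines from $fh until a line holding only
# "." arrives.  Returns an array ref of body lines, or undef if the input
# ended before the terminator did.
sub read_dot_terminated {
    my ($fh) = @_;
    my @body;
    while (defined(my $line = <$fh>)) {
        chomp($line);
        return \@body if $line eq ".";
        push @body, $line;
    }
    return;     # EOF before the terminating "." line
}

# An in-memory filehandle plays the part of the network connection.
my $wire = "first line\nsecond line\n.\nleftover\n";
open(my $fh, "<", \$wire) || die "can't open in-memory handle: $!";
my $message = read_dot_terminated($fh);
print scalar(@$message), " lines: @$message\n";
```

Because the reader stops at the terminator, whatever follows it (here,
"leftover") stays on the handle for the next message.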
| 906 | |
| 907 | =head2 Internet Line Terminators |
| 908 | |
| 909 | The Internet line terminator is "\015\012". Under ASCII variants of |
| 910 | Unix, that could usually be written as "\r\n", but under other systems, |
| 911 | "\r\n" might at times be "\015\015\012", "\012\012\015", or something |
| 912 | completely different. The standards specify writing "\015\012" to be |
| 913 | conformant (be strict in what you provide), but they also recommend |
| 914 | accepting a lone "\012" on input (be lenient in what you require). |
| 915 | We haven't always been very good about that in the code in this manpage, |
| 916 | but unless you're on a Mac from way back in its pre-Unix dark ages, you'll |
| 917 | probably be ok. |
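
One way to follow both halves of that advice is sketched below. The helper
names C<to_wire> and C<from_wire> are hypothetical and not part of any module.

```perl
use strict;
use warnings;

my $CRLF = "\015\012";   # spelled in octal so it means the same everywhere

# Strict in what you provide: always terminate with CR LF.
sub to_wire {
    my ($text) = @_;
    return $text . $CRLF;
}

# Lenient in what you require: strip either CR LF or a lone LF.
sub from_wire {
    my ($line) = @_;
    $line =~ s/\015?\012\z//;
    return $line;
}

printf "%vd\n", to_wire("hi");            # prints 104.105.13.10
print from_wire("hi\012"), "\n";          # prints hi
```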
| 918 | |
| 919 | =head2 Internet TCP Clients and Servers |
| 920 | |
| 921 | Use Internet-domain sockets when you want to do client-server |
| 922 | communication that might extend to machines outside of your own system. |
| 923 | |
| 924 | Here's a sample TCP client using Internet-domain sockets: |
| 925 | |
| 926 | #!/usr/bin/perl -w |
| 927 | use strict; |
| 928 | use Socket; |
| 929 | my ($remote, $port, $iaddr, $paddr, $proto, $line); |
| 930 | |
| 931 | $remote = shift || "localhost"; |
| 932 | $port = shift || 2345; # random port |
| 933 | if ($port =~ /\D/) { $port = getservbyname($port, "tcp") } |
| 934 | die "No port" unless $port; |
| 935 | $iaddr = inet_aton($remote) || die "no host: $remote"; |
| 936 | $paddr = sockaddr_in($port, $iaddr); |
| 937 | |
| 938 | $proto = getprotobyname("tcp"); |
| 939 | socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; |
| 940 | connect(SOCK, $paddr) || die "connect: $!"; |
| 941 | while ($line = <SOCK>) { |
| 942 | print $line; |
| 943 | } |
| 944 | |
| 945 | close (SOCK) || die "close: $!"; |
| 946 | exit(0); |
| 947 | |
| 948 | And here's a corresponding server to go along with it. We'll |
| 949 | leave the address as C<INADDR_ANY> so that the kernel can choose |
| 950 | the appropriate interface on multihomed hosts. If you want to sit |
| 951 | on a particular interface (like the external side of a gateway |
| 952 | or firewall machine), fill this in with your real address instead. |
| 953 | |
| 954 | #!/usr/bin/perl -Tw |
| 955 | use strict; |
| 956 | BEGIN { $ENV{PATH} = "/usr/bin:/bin" } |
| 957 | use Socket; |
| 958 | use Carp; |
| 959 | my $EOL = "\015\012"; |
| 960 | |
| 961 | sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" } |
| 962 | |
| 963 | my $port = shift || 2345; |
| 964 | die "invalid port" unless $port =~ /^ \d+ $/x; |
| 965 | |
| 966 | my $proto = getprotobyname("tcp"); |
| 967 | |
| 968 | socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; |
| 969 | setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1)) |
| 970 | || die "setsockopt: $!"; |
| 971 | bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!"; |
| 972 | listen(Server, SOMAXCONN) || die "listen: $!"; |
| 973 | |
| 974 | logmsg "server started on port $port"; |
| 975 | |
| 976 | my $paddr; |
| 977 | |
| 978 | $SIG{CHLD} = \&REAPER; |
| 979 | |
| 980 | for ( ; $paddr = accept(Client, Server); close Client) { |
| 981 | my($port, $iaddr) = sockaddr_in($paddr); |
| 982 | my $name = gethostbyaddr($iaddr, AF_INET); |
| 983 | |
| 984 | logmsg "connection from $name [", |
| 985 | inet_ntoa($iaddr), |
| 986 | "] at port $port"; |
| 987 | |
| 988 | print Client "Hello there, $name, it's now ", |
| 989 | scalar localtime(), $EOL; |
| 990 | } |
| 991 | |
| 992 | And here's a multitasking version. It's multitasked in that |
| 993 | like most typical servers, it spawns (fork()s) a slave server to |
| 994 | handle the client request so that the master server can quickly |
| 995 | go back to service a new client. |
| 996 | |
| 997 | #!/usr/bin/perl -Tw |
| 998 | use strict; |
| 999 | BEGIN { $ENV{PATH} = "/usr/bin:/bin" } |
| 1000 | use Socket; |
| 1001 | use Carp; |
| 1002 | my $EOL = "\015\012"; |
| 1003 | |
| 1004 | sub spawn; # forward declaration |
| 1005 | sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" } |
| 1006 | |
| 1007 | my $port = shift || 2345; |
| 1008 | die "invalid port" unless $port =~ /^ \d+ $/x; |
| 1009 | |
| 1010 | my $proto = getprotobyname("tcp"); |
| 1011 | |
| 1012 | socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; |
| 1013 | setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1)) |
| 1014 | || die "setsockopt: $!"; |
| 1015 | bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!"; |
| 1016 | listen(Server, SOMAXCONN) || die "listen: $!"; |
| 1017 | |
| 1018 | logmsg "server started on port $port"; |
| 1019 | |
| 1020 | my $waitedpid = 0; |
| 1021 | my $paddr; |
| 1022 | |
| 1023 | use POSIX ":sys_wait_h"; |
| 1024 | use Errno; |
| 1025 | |
| 1026 | sub REAPER { |
| 1027 | local $!; # don't let waitpid() overwrite current error |
| 1028 | while ((my $pid = waitpid(-1, WNOHANG)) > 0 && WIFEXITED($?)) { |
| 1029 | logmsg "reaped $pid" . ($? ? " with exit $?" : ""); |
| 1030 | } |
| 1031 | $SIG{CHLD} = \&REAPER; # loathe SysV |
| 1032 | } |
| 1033 | |
| 1034 | $SIG{CHLD} = \&REAPER; |
| 1035 | |
| 1036 | while (1) { |
| 1037 | $paddr = accept(Client, Server) || do { |
| 1038 | # try again if accept() returned because got a signal |
| 1039 | next if $!{EINTR}; |
| 1040 | die "accept: $!"; |
| 1041 | }; |
| 1042 | my ($port, $iaddr) = sockaddr_in($paddr); |
| 1043 | my $name = gethostbyaddr($iaddr, AF_INET); |
| 1044 | |
| 1045 | logmsg "connection from $name [", |
| 1046 | inet_ntoa($iaddr), |
| 1047 | "] at port $port"; |
| 1048 | |
| 1049 | spawn sub { |
| 1050 | $| = 1; |
| 1051 | print "Hello there, $name, it's now ", scalar localtime(), $EOL; |
| 1052 | exec "/usr/games/fortune" # XXX: "wrong" line terminators |
| 1053 | or confess "can't exec fortune: $!"; |
| 1054 | }; |
| 1055 | close Client; |
| 1056 | } |
| 1057 | |
| 1058 | sub spawn { |
| 1059 | my $coderef = shift; |
| 1060 | |
| 1061 | unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") { |
| 1062 | confess "usage: spawn CODEREF"; |
| 1063 | } |
| 1064 | |
| 1065 | my $pid; |
| 1066 | unless (defined($pid = fork())) { |
| 1067 | logmsg "cannot fork: $!"; |
| 1068 | return; |
| 1069 | } |
| 1070 | elsif ($pid) { |
| 1071 | logmsg "begat $pid"; |
| 1072 | return; # I'm the parent |
| 1073 | } |
| 1074 | # else I'm the child -- go spawn |
| 1075 | |
| 1076 | open(STDIN, "<&Client") || die "can't dup client to stdin"; |
| 1077 | open(STDOUT, ">&Client") || die "can't dup client to stdout"; |
| 1078 | ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr"; |
| 1079 | exit($coderef->()); |
| 1080 | } |
| 1081 | |
| 1082 | This server takes the trouble to clone off a child version via fork() |
| 1083 | for each incoming request. That way it can handle many requests at |
| 1084 | once, which you might not always want. Even if you don't fork(), the |
| 1085 | listen() will still queue up to SOMAXCONN pending connections. Forking servers |
| 1086 | have to be particularly careful about cleaning up their dead children |
| 1087 | (called "zombies" in Unix parlance), because otherwise you'll quickly |
| 1088 | fill up your process table. The REAPER subroutine is used here to |
| 1089 | call waitpid() for any child processes that have finished, thereby |
| 1090 | ensuring that they terminate cleanly and don't join the ranks of the |
| 1091 | living dead. |
| 1092 | |
| 1093 | Within the while loop we call accept() and check to see if it returns |
| 1094 | a false value. This would normally indicate a system error needs |
| 1095 | to be reported. However, the introduction of safe signals (see |
| 1096 | L</Deferred Signals (Safe Signals)> above) in Perl 5.8.0 means that |
| 1097 | accept() might also be interrupted when the process receives a signal. |
| 1098 | This typically happens when one of the forked subprocesses exits and |
| 1099 | notifies the parent process with a CHLD signal. |
| 1100 | |
| 1101 | If accept() is interrupted by a signal, $! will be set to EINTR. |
| 1102 | If this happens, we can safely continue to the next iteration of |
| 1103 | the loop and another call to accept(). It is important that your |
| 1104 | signal handling code not modify the value of $!, or else this test |
| 1105 | will likely fail. In the REAPER subroutine we create a local version |
| 1106 | of $! before calling waitpid(). When waitpid() sets $! to ECHILD as |
| 1107 | it inevitably does when it has no more children waiting, it |
| 1108 | updates the local copy and leaves the original unchanged. |
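
The effect of that C<local $!> can be seen in isolation with a sketch like the
following. It calls two stand-in handlers directly instead of waiting for a
real signal, and it assumes the process has no children at that moment, so
that waitpid() fails with ECHILD. The sub names are illustrative only.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";
use Errno qw(ENOENT ECHILD);

# Two stand-in handlers: one lets waitpid() overwrite $!, one does not.
sub careless_reaper {           1 while waitpid(-1, WNOHANG) > 0 }
sub careful_reaper  { local $!; 1 while waitpid(-1, WNOHANG) > 0 }

# Pretend some system call has just failed with ENOENT, run a handler as
# if a signal had arrived, and report what $! holds afterwards.
sub errno_after {
    my ($handler) = @_;
    $! = ENOENT;
    $handler->();
    return $! + 0;
}

for my $pair ([careless => \&careless_reaper], [careful => \&careful_reaper]) {
    my ($label, $handler) = @$pair;
    my $kept = errno_after($handler) == ENOENT;
    print "$label handler: \$! ", ($kept ? "preserved" : "clobbered"), "\n";
}
```

With the careless version, a test such as C<$!{EINTR}> in the main loop would
be looking at the handler's error instead of the one from accept().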
| 1109 | |
| 1110 | You should use the B<-T> flag to enable taint checking (see L<perlsec>) |
| 1111 | even if you aren't running setuid or setgid. This is always a good idea |
| 1112 | for servers or any program run on behalf of someone else (like CGI |
| 1113 | scripts), because it lessens the chances that people from the outside will |
| 1114 | be able to compromise your system. |
| 1115 | |
| 1116 | Let's look at another TCP client. This one connects to the TCP "time" |
| 1117 | service on a number of different machines and shows how far their clocks |
| 1118 | differ from the system on which it's being run: |
| 1119 | |
| 1120 | #!/usr/bin/perl -w |
| 1121 | use strict; |
| 1122 | use Socket; |
| 1123 | |
| 1124 | my $SECS_OF_70_YEARS = 2208988800; |
| 1125 | sub ctime { scalar localtime(shift() || time()) } |
| 1126 | |
| 1127 | my $iaddr = gethostbyname("localhost"); |
| 1128 | my $proto = getprotobyname("tcp"); |
| 1129 | my $port = getservbyname("time", "tcp"); |
| 1130 | my $paddr = sockaddr_in(0, $iaddr); |
| 1131 | my($host); |
| 1132 | |
| 1133 | $| = 1; |
| 1134 | printf "%-24s %8s %s\n", "localhost", 0, ctime(); |
| 1135 | |
| 1136 | foreach $host (@ARGV) { |
| 1137 | printf "%-24s ", $host; |
| 1138 | my $hisiaddr = inet_aton($host) || die "unknown host"; |
| 1139 | my $hispaddr = sockaddr_in($port, $hisiaddr); |
| 1140 | socket(SOCKET, PF_INET, SOCK_STREAM, $proto) |
| 1141 | || die "socket: $!"; |
| 1142 | connect(SOCKET, $hispaddr) || die "connect: $!"; |
| 1143 | my $rtime = pack("C4", ()); |
| 1144 | read(SOCKET, $rtime, 4); |
| 1145 | close(SOCKET); |
| 1146 | my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS; |
| 1147 | printf "%8d %s\n", $histime - time(), ctime($histime); |
| 1148 | } |
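
The arithmetic in that client is easier to check away from the network. The
"time" service (RFC 868) sends a 32-bit big-endian count of seconds since
1 January 1900, whereas Perl's time() counts from 1 January 1970. The sketch
below derives the offset and converts a packed value; C<rfc868_to_epoch> is a
hypothetical name.

```perl
use strict;
use warnings;

# 1900 through 1969 is 70 years containing 17 leap days
# (1904, 1908, ..., 1968; 1900 itself was not a leap year).
my $SECS_OF_70_YEARS = (70 * 365 + 17) * 86_400;    # 2_208_988_800

# Hypothetical helper: turn the 4 bytes read from the socket into
# seconds since the Unix epoch.
sub rfc868_to_epoch {
    my ($packed) = @_;
    length($packed) == 4
        || die "expected 4 bytes, got " . length($packed);
    return unpack("N", $packed) - $SECS_OF_70_YEARS;
}

my $sample = pack("N", $SECS_OF_70_YEARS + 86_400);  # one day past 1970-01-01
print rfc868_to_epoch($sample), "\n";                # prints 86400
```

The length check matters because a short read would otherwise make unpack()
return undef without complaint.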
| 1149 | |
| 1150 | =head2 Unix-Domain TCP Clients and Servers |
| 1151 | |
| 1152 | That's fine for Internet-domain clients and servers, but what about local |
| 1153 | communications? While you can use the same setup, sometimes you don't |
| 1154 | want to. Unix-domain sockets are local to the current host, and are often |
| 1155 | used internally to implement pipes. Unlike Internet domain sockets, Unix |
| 1156 | domain sockets can show up in the file system with an ls(1) listing. |
| 1157 | |
| 1158 | % ls -l /dev/log |
| 1159 | srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log |
| 1160 | |
| 1161 | You can test for these with Perl's B<-S> file test: |
| 1162 | |
| 1163 | unless (-S "/dev/log") { |
| 1164 | die "something's wicked with the log system"; |
| 1165 | } |
| 1166 | |
| 1167 | Here's a sample Unix-domain client: |
| 1168 | |
| 1169 | #!/usr/bin/perl -w |
| 1170 | use Socket; |
| 1171 | use strict; |
| 1172 | my ($rendezvous, $line); |
| 1173 | |
| 1174 | $rendezvous = shift || "catsock"; |
| 1175 | socket(SOCK, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!"; |
| 1176 | connect(SOCK, sockaddr_un($rendezvous)) || die "connect: $!"; |
| 1177 | while (defined($line = <SOCK>)) { |
| 1178 | print $line; |
| 1179 | } |
| 1180 | exit(0); |
| 1181 | |
| 1182 | And here's a corresponding server. You don't have to worry about silly |
| 1183 | network terminators here because Unix domain sockets are guaranteed |
| 1184 | to be on the localhost, and thus everything works right. |
| 1185 | |
| 1186 | #!/usr/bin/perl -Tw |
| 1187 | use strict; |
| 1188 | use Socket; |
| 1189 | use Carp; |
| 1190 | |
| 1191 | BEGIN { $ENV{PATH} = "/usr/bin:/bin" } |
| 1192 | sub spawn; # forward declaration |
| 1193 | sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" } |
| 1194 | |
| 1195 | my $NAME = "catsock"; |
| 1196 | my $uaddr = sockaddr_un($NAME); |
| 1197 | my $proto = getprotobyname("tcp"); |
| 1198 | |
| 1199 | socket(Server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!"; |
| 1200 | unlink($NAME); |
| 1201 | bind (Server, $uaddr) || die "bind: $!"; |
| 1202 | listen(Server, SOMAXCONN) || die "listen: $!"; |
| 1203 | |
| 1204 | logmsg "server started on $NAME"; |
| 1205 | |
| 1206 | my $waitedpid; |
| 1207 | |
| 1208 | use POSIX ":sys_wait_h"; |
| 1209 | sub REAPER { |
| 1210 | my $child; |
| 1211 | while (($waitedpid = waitpid(-1, WNOHANG)) > 0) { |
| 1212 | logmsg "reaped $waitedpid" . ($? ? " with exit $?" : ""); |
| 1213 | } |
| 1214 | $SIG{CHLD} = \&REAPER; # loathe SysV |
| 1215 | } |
| 1216 | |
| 1217 | $SIG{CHLD} = \&REAPER; |
| 1218 | |
| 1219 | |
| 1220 | for ( $waitedpid = 0; |
| 1221 | accept(Client, Server) || $waitedpid; |
| 1222 | $waitedpid = 0, close Client) |
| 1223 | { |
| 1224 | next if $waitedpid; |
| 1225 | logmsg "connection on $NAME"; |
| 1226 | spawn sub { |
| 1227 | print "Hello there, it's now ", scalar localtime(), "\n"; |
| 1228 | exec("/usr/games/fortune") || die "can't exec fortune: $!"; |
| 1229 | }; |
| 1230 | } |
| 1231 | |
| 1232 | sub spawn { |
| 1233 | my $coderef = shift(); |
| 1234 | |
| 1235 | unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") { |
| 1236 | confess "usage: spawn CODEREF"; |
| 1237 | } |
| 1238 | |
| 1239 | my $pid; |
| 1240 | unless (defined($pid = fork())) { |
| 1241 | logmsg "cannot fork: $!"; |
| 1242 | return; |
| 1243 | } |
| 1244 | elsif ($pid) { |
| 1245 | logmsg "begat $pid"; |
| 1246 | return; # I'm the parent |
| 1247 | } |
| 1248 | else { |
| 1249 | # I'm the child -- go spawn |
| 1250 | } |
| 1251 | |
| 1252 | open(STDIN, "<&Client") || die "can't dup client to stdin"; |
| 1253 | open(STDOUT, ">&Client") || die "can't dup client to stdout"; |
| 1254 | ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr"; |
| 1255 | exit($coderef->()); |
| 1256 | } |
| 1257 | |
| 1258 | As you see, it's remarkably similar to the Internet domain TCP server, so |
| 1259 | much so, in fact, that several functions--spawn(), logmsg(), and |
| 1260 | REAPER()--are essentially the same as in the other server. |
| 1261 | |
| 1262 | So why would you ever want to use a Unix domain socket instead of a |
| 1263 | simpler named pipe? Because a named pipe doesn't give you sessions. You |
| 1264 | can't tell one process's data from another's. With socket programming, |
| 1265 | you get a separate session for each client; that's why accept() takes two |
| 1266 | arguments. |
| 1267 | |
| 1268 | For example, let's say that you have a long-running database server daemon |
| 1269 | that you want folks to be able to access from the Web, but only |
| 1270 | if they go through a CGI interface. You'd have a small, simple CGI |
| 1271 | program that does whatever checks and logging you feel like, and then acts |
| 1272 | as a Unix-domain client and connects to your private server. |
| 1273 | |
| 1274 | =head1 TCP Clients with IO::Socket |
| 1275 | |
| 1276 | For those preferring a higher-level interface to socket programming, the |
| 1277 | IO::Socket module provides an object-oriented approach. If for some reason |
| 1278 | you lack this module, you can just fetch IO::Socket from CPAN, where you'll also |
| 1279 | find modules providing easy interfaces to the following systems: DNS, FTP, |
| 1280 | Ident (RFC 931), NIS and NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, |
| 1281 | Telnet, and Time--to name just a few. |
| 1282 | |
| 1283 | =head2 A Simple Client |
| 1284 | |
| 1285 | Here's a client that creates a TCP connection to the "daytime" |
| 1286 | service at port 13 of the host name "localhost" and prints out everything |
| 1287 | that the server there cares to provide. |
| 1288 | |
| 1289 | #!/usr/bin/perl -w |
| 1290 | use IO::Socket; |
| 1291 | $remote = IO::Socket::INET->new( |
| 1292 | Proto => "tcp", |
| 1293 | PeerAddr => "localhost", |
| 1294 | PeerPort => "daytime(13)", |
| 1295 | ) |
| 1296 | || die "can't connect to daytime service on localhost"; |
| 1297 | while (<$remote>) { print } |
| 1298 | |
| 1299 | When you run this program, you should get something back that |
| 1300 | looks like this: |
| 1301 | |
| 1302 | Wed May 14 08:40:46 MDT 1997 |
| 1303 | |
| 1304 | Here is what those parameters to the new() constructor mean: |
| 1305 | |
| 1306 | =over 4 |
| 1307 | |
| 1308 | =item C<Proto> |
| 1309 | |
| 1310 | This is which protocol to use. In this case, the socket handle returned |
| 1311 | will be connected to a TCP socket, because we want a stream-oriented |
| 1312 | connection, that is, one that acts pretty much like a plain old file. |
| 1313 | Not all sockets are of this type. For example, the UDP protocol |
| 1314 | can be used to make a datagram socket, used for message-passing. |
| 1315 | |
| 1316 | =item C<PeerAddr> |
| 1317 | |
| 1318 | This is the name or Internet address of the remote host the server is |
| 1319 | running on. We could have specified a longer name like C<"www.perl.com">, |
| 1320 | or an address like C<"207.171.7.72">. For demonstration purposes, we've |
| 1321 | used the special hostname C<"localhost">, which should always mean the |
| 1322 | current machine you're running on. The corresponding Internet address |
| 1323 | for localhost is C<"127.0.0.1">, if you'd rather use that. |
| 1324 | |
| 1325 | =item C<PeerPort> |
| 1326 | |
| 1327 | This is the service name or port number we'd like to connect to. |
| 1328 | We could have gotten away with using just C<"daytime"> on systems with a |
| 1329 | well-configured system services file,[FOOTNOTE: The system services file |
| 1330 | is found in I</etc/services> under Unixy systems.] but here we've specified the |
| 1331 | port number (13) in parentheses. Using just the number would have also |
| 1332 | worked, but numeric literals make careful programmers nervous. |
| 1333 | |
| 1334 | =back |
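
The idea behind a port specification like C<"daytime(13)"> can be sketched
with the built-in getservbyname(): try the name first, and fall back to the
number in parentheses. This is only an illustration of the idea under that
assumption, not the code IO::Socket::INET itself uses, and C<resolve_port> is
a made-up name.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "name", "number", or "name(number)" into a
# port number, preferring the services database and falling back to the
# number in parentheses.  Returns undef if nothing usable is found.
sub resolve_port {
    my ($spec, $proto) = @_;
    my ($name, $fallback) = $spec =~ /^([^()]+)(?:\((\d+)\))?$/
        or return;
    return $name + 0 if $name =~ /^\d+$/;       # already numeric
    my $port = getservbyname($name, $proto);    # scalar context: the port
    return defined($port) ? $port : $fallback;
}

print resolve_port("daytime(13)", "tcp"), "\n";   # prints 13
```

On a machine whose services file lacks an entry for the name, the lookup
fails quietly and the parenthesized number is used instead.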
| 1335 | |
| 1336 | Notice how the return value from the C<new> constructor is used as |
| 1337 | a filehandle in the C<while> loop? That's what's called an I<indirect |
| 1338 | filehandle>, a scalar variable containing a filehandle. You can use |
| 1339 | it the same way you would a normal filehandle. For example, you |
| 1340 | can read one line from it this way: |
| 1341 | |
| 1342 | $line = <$handle>; |
| 1343 | |
| 1344 | all remaining lines from it this way: |
| 1345 | |
| 1346 | @lines = <$handle>; |
| 1347 | |
| 1348 | and send a line of data to it this way: |
| 1349 | |
| 1350 | print $handle "some data\n"; |
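
To try those three operations without any server running, one option is a
connected pair of sockets inside a single process. The sketch below assumes
your system supports socketpair() for C<AF_UNIX>; the sub name
C<demo_indirect_handle> is illustrative.

```perl
use strict;
use warnings;
use Socket;
use IO::Socket;

# Hypothetical demo: one process plays both ends of a connection.
sub demo_indirect_handle {
    my ($near, $far) =
        IO::Socket->socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";
    print $near "first\n", "second\n", "third\n";   # send data
    close($near) || die "close: $!";                # lets reads reach EOF
    my $line  = <$far>;                             # one line
    my @lines = <$far>;                             # all remaining lines
    close($far);
    return ($line, \@lines);
}

my ($line, $rest) = demo_indirect_handle();
print "one line: $line";
print "remaining: ", scalar(@$rest), "\n";
```

The list-context read returns only when the other end has been closed, which
is why the writer closes its handle before the reads begin.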
| 1351 | |
| 1352 | =head2 A Webget Client |
| 1353 | |
| 1354 | Here's a simple client that takes a remote host to fetch a document |
| 1355 | from, and then a list of files to get from that host. This is a |
| 1356 | more interesting client than the previous one because it first sends |
| 1357 | something to the server before fetching the server's response. |
| 1358 | |
| 1359 | #!/usr/bin/perl -w |
| 1360 | use IO::Socket; |
| 1361 | unless (@ARGV > 1) { die "usage: $0 host url ..." } |
| 1362 | $host = shift(@ARGV); |
| 1363 | $EOL = "\015\012"; |
| 1364 | $BLANK = $EOL x 2; |
| 1365 | for my $document (@ARGV) { |
| 1366 | $remote = IO::Socket::INET->new( Proto => "tcp", |
| 1367 | PeerAddr => $host, |
| 1368 | PeerPort => "http(80)", |
| 1369 | ) || die "cannot connect to httpd on $host"; |
| 1370 | $remote->autoflush(1); |
| 1371 | print $remote "GET $document HTTP/1.0" . $BLANK; |
| 1372 | while ( <$remote> ) { print } |
| 1373 | close $remote; |
| 1374 | } |
| 1375 | |
| 1376 | The web server handling the HTTP service is assumed to be at |
| 1377 | its standard port, number 80. If the server you're trying to |
| 1378 | connect to is at a different port, like 1080 or 8080, you should specify it |
| 1379 | as the named-parameter pair, C<< PeerPort => 8080 >>. The C<autoflush> |
| 1380 | method is used on the socket because otherwise the system would buffer |
| 1381 | up the output we sent it. (If you're on a prehistoric Mac, you'll also |
| 1382 | need to change every C<"\n"> in your code that sends data over the network |
| 1383 | to be a C<"\015\012"> instead.) |
| 1384 | |
| 1385 | Connecting to the server is only the first part of the process: once you |
| 1386 | have the connection, you have to use the server's language. Each server |
| 1387 | on the network has its own little command language that it expects as |
| 1388 | input. The string that we send to the server starting with "GET" is in |
| 1389 | HTTP syntax. In this case, we simply request each specified document. |
| 1390 | Yes, we really are making a new connection for each document, even though |
| 1391 | it's the same host. That's the way you always used to have to speak HTTP. |
| 1392 | Recent versions of web browsers may request that the remote server leave |
| 1393 | the connection open a little while, but the server doesn't have to honor |
| 1394 | such a request. |
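
As a minimal sketch of the request syntax described above, the string sent to the server can be assembled by a small helper. The name C<build_request> and the C<Host> header are assumptions made for this illustration; the I<webget> program above sends only the C<GET> line. Many servers that host several sites on one address expect a C<Host> header even from HTTP/1.0 clients.

```perl
use strict;
use warnings;

# build_request() is a hypothetical helper, not part of webget above.
sub build_request {
    my ($host, $document) = @_;
    my $EOL = "\015\012";              # network line terminator (CR LF)
    return join $EOL,
        "GET $document HTTP/1.0",      # request line
        "Host: $host",                 # assumption: added for virtual-hosted servers
        "", "";                        # empty line marks the end of the headers
}

print build_request("www.example.com", "/index.html");
```

The two trailing empty strings make C<join> end the request with a blank line, which is what C<$BLANK> does in the program above.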
| 1395 | |
| 1396 | Here's an example of running that program, which we'll call I<webget>: |
| 1397 | |
| 1398 | % webget www.perl.com /guanaco.html |
| 1399 | HTTP/1.1 404 File Not Found |
| 1400 | Date: Thu, 08 May 1997 18:02:32 GMT |
| 1401 | Server: Apache/1.2b6 |
| 1402 | Connection: close |
| 1403 | Content-type: text/html |
| 1404 | |
| 1405 | <HEAD><TITLE>404 File Not Found</TITLE></HEAD> |
| 1406 | <BODY><H1>File Not Found</H1> |
| 1407 | The requested URL /guanaco.html was not found on this server.<P> |
| 1408 | </BODY> |
| 1409 | |
| 1410 | Ok, so that's not very interesting, because it didn't find that |
| 1411 | particular document. But a long response wouldn't have fit on this page. |
| 1412 | |
| 1413 | For a more featureful version of this program, you should look to |
| 1414 | the I<lwp-request> program included with the LWP modules from CPAN. |
| 1415 | |
| 1416 | =head2 Interactive Client with IO::Socket |
| 1417 | |
| 1418 | Well, that's all fine if you want to send one command and get one answer, |
| 1419 | but what about setting up something fully interactive, somewhat like |
| 1420 | the way I<telnet> works? That way you can type a line, get the answer, |
| 1421 | type a line, get the answer, etc. |
| 1422 | |
| 1423 | This client is more complicated than the two we've done so far, but if |
| 1424 | you're on a system that supports the powerful C<fork> call, the solution |
| 1425 | isn't that rough. Once you've made the connection to whatever service |
| 1426 | you'd like to chat with, call C<fork> to clone your process. Each of |
| 1427 | these two identical processes has a very simple job to do: the parent |
| 1428 | copies everything from the socket to standard output, while the child |
| 1429 | simultaneously copies everything from standard input to the socket. |
| 1430 | To accomplish the same thing using just one process would be I<much> |
| 1431 | harder, because it's easier to code two processes to do one thing than it |
| 1432 | is to code one process to do two things. (This keep-it-simple principle |
| 1433 | is one of the cornerstones of the Unix philosophy, and good software engineering as |
| 1434 | well, which is probably why it's spread to other systems.) |
| 1435 | |
| 1436 | Here's the code: |
| 1437 | |
| 1438 | #!/usr/bin/perl -w |
| 1439 | use strict; |
| 1440 | use IO::Socket; |
| 1441 | my ($host, $port, $kidpid, $handle, $line); |
| 1442 | |
| 1443 | unless (@ARGV == 2) { die "usage: $0 host port" } |
| 1444 | ($host, $port) = @ARGV; |
| 1445 | |
| 1446 | # create a tcp connection to the specified host and port |
| 1447 | $handle = IO::Socket::INET->new(Proto => "tcp", |
| 1448 | PeerAddr => $host, |
| 1449 | PeerPort => $port) |
| 1450 | || die "can't connect to port $port on $host: $!"; |
| 1451 | |
| 1452 | $handle->autoflush(1); # so output gets there right away |
| 1453 | print STDERR "[Connected to $host:$port]\n"; |
| 1454 | |
| 1455 | # split the program into two processes, identical twins |
| 1456 | die "can't fork: $!" unless defined($kidpid = fork()); |
| 1457 | |
| 1458 | # the if{} block runs only in the parent process |
| 1459 | if ($kidpid) { |
| 1460 | # copy the socket to standard output |
| 1461 | while (defined ($line = <$handle>)) { |
| 1462 | print STDOUT $line; |
| 1463 | } |
| 1464 | kill("TERM", $kidpid); # send SIGTERM to child |
| 1465 | } |
| 1466 | # the else{} block runs only in the child process |
| 1467 | else { |
| 1468 | # copy standard input to the socket |
| 1469 | while (defined ($line = <STDIN>)) { |
| 1470 | print $handle $line; |
| 1471 | } |
| 1472 | exit(0); # just in case |
| 1473 | } |
| 1474 | |
| 1475 | The C<kill> function in the parent's C<if> block is there to send a |
| 1476 | signal to our child process, currently running in the C<else> block, |
| 1477 | as soon as the remote server has closed its end of the connection. |
| 1478 | |
| 1479 | If the remote server sends data a byte at a time, and you need that |
| 1480 | data immediately without waiting for a newline (which might not happen), |
| 1481 | you may wish to replace the C<while> loop in the parent with the |
| 1482 | following: |
| 1483 | |
| 1484 | my $byte; |
| 1485 | while (sysread($handle, $byte, 1) == 1) { |
| 1486 | print STDOUT $byte; |
| 1487 | } |
| 1488 | |
| 1489 | Making a system call for each byte you want to read is not very efficient |
| 1490 | (to put it mildly) but is the simplest to explain and works reasonably |
| 1491 | well. |
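
If one system call per byte proves too slow, a possible middle ground is to ask C<sysread> for a larger chunk. On a socket or pipe it returns as soon as any data is available, so partial lines such as prompts still arrive promptly. The helper name C<copy_chunks> and the 4096-byte chunk size are assumptions chosen for this sketch.

```perl
use strict;
use warnings;

# copy_chunks() is a hypothetical helper: copy $in to $out until EOF,
# reading up to 4096 bytes per system call, and return the byte count.
sub copy_chunks {
    my ($in, $out) = @_;
    my $total = 0;
    while (1) {
        my $got = sysread($in, my $buf, 4096);
        die "sysread: $!" unless defined $got;   # undef means a read error
        last if $got == 0;                       # 0 means end of file
        print {$out} $buf or die "print: $!";
        $total += $got;
    }
    return $total;
}

# In the parent branch of the client, this would replace the while loop:
#     $| = 1;                          # unbuffer STDOUT so prompts show at once
#     copy_chunks($handle, \*STDOUT);
```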
| 1492 | |
| 1493 | =head1 TCP Servers with IO::Socket |
| 1494 | |
| 1495 | As always, setting up a server is a little bit more involved than running a client. |
| 1496 | The model is that the server creates a special kind of socket that |
| 1497 | does nothing but listen on a particular port for incoming connections. |
| 1498 | It does this by calling the C<< IO::Socket::INET->new() >> method with |
| 1499 | slightly different arguments than the client did. |
| 1500 | |
| 1501 | =over 4 |
| 1502 | |
| 1503 | =item Proto |
| 1504 | |
| 1505 | This is which protocol to use. Like our clients, we'll |
| 1506 | still specify C<"tcp"> here. |
| 1507 | |
| 1508 | =item LocalPort |
| 1509 | |
| 1510 | We specify a local |
| 1511 | port in the C<LocalPort> argument, which we didn't do for the client. |
| 1512 | This is the service name or port number for which you want to be the |
| 1513 | server. (Under Unix, ports under 1024 are restricted to the |
| 1514 | superuser.) In our sample, we'll use port 9000, but you can use |
| 1515 | any port that's not currently in use on your system. If you try |
| 1516 | to use one already in use, you'll get an "Address already in use" |
| 1517 | message. Under Unix, the C<netstat -a> command will show |
| 1518 | which services currently have servers. |
| 1519 | |
| 1520 | =item Listen |
| 1521 | |
| 1522 | The C<Listen> parameter is set to the maximum number of |
| 1523 | pending connections we can accept until we turn away incoming clients. |
| 1524 | Think of it as a call-waiting queue for your telephone. |
| 1525 | The low-level Socket module has a special symbol for the system maximum, which |
| 1526 | is SOMAXCONN. |
| 1527 | |
| 1528 | =item Reuse |
| 1529 | |
| 1530 | The C<Reuse> parameter is needed so that we can restart our server |
| 1531 | manually without waiting a few minutes to allow system buffers to |
| 1532 | clear out. |
| 1533 | |
| 1534 | =back |
| 1535 | |
| 1536 | Once the generic server socket has been created using the parameters |
| 1537 | listed above, the server then waits for a new client to connect |
| 1538 | to it. The server blocks in the C<accept> method, which eventually accepts a |
| 1539 | bidirectional connection from the remote client. (Make sure to autoflush |
| 1540 | this handle to circumvent buffering.) |
| 1541 | |
| 1542 | To add to user-friendliness, our server prompts the user for commands. |
| 1543 | Most servers don't do this. Because of the prompt without a newline, |
| 1544 | you'll have to use the C<sysread> variant of the interactive client above. |
| 1545 | |
| 1546 | This server accepts one of five different commands, sending output back to |
| 1547 | the client. Unlike most network servers, this one handles only one |
| 1548 | incoming client at a time. Multitasking servers are covered in |
| 1549 | Chapter 16 of the Camel. |
| 1550 | |
| 1551 | Here's the code: |
| 1552 | |
| 1553 | #!/usr/bin/perl -w |
| 1554 | use IO::Socket; |
| 1555 | use Net::hostent; # for OOish version of gethostbyaddr |
| 1556 | |
| 1557 | $PORT = 9000; # pick something not in use |
| 1558 | |
| 1559 | $server = IO::Socket::INET->new( Proto => "tcp", |
| 1560 | LocalPort => $PORT, |
| 1561 | Listen => SOMAXCONN, |
| 1562 | Reuse => 1); |
| 1563 | |
| 1564 | die "can't set up server: $@" unless $server; |
| 1565 | print "[Server $0 accepting clients]\n"; |
| 1566 | |
| 1567 | while ($client = $server->accept()) { |
| 1568 | $client->autoflush(1); |
| 1569 | print $client "Welcome to $0; type help for command list.\n"; |
| 1570 | $hostinfo = gethostbyaddr($client->peeraddr); |
| 1571 | printf "[Connect from %s]\n", $hostinfo ? $hostinfo->name : $client->peerhost; |
| 1572 | print $client "Command? "; |
| 1573 | while ( <$client>) { |
| 1574 | next unless /\S/; # blank line |
| 1575 | if (/quit|exit/i) { last } |
| 1576 | elsif (/date|time/i) { printf $client "%s\n", scalar localtime() } |
| 1577 | elsif (/who/i ) { print $client `who 2>&1` } |
| 1578 | elsif (/cookie/i ) { print $client `/usr/games/fortune 2>&1` } |
| 1579 | elsif (/motd/i ) { print $client `cat /etc/motd 2>&1` } |
| 1580 | else { |
| 1581 | print $client "Commands: quit date who cookie motd\n"; |
| 1582 | } |
| 1583 | } continue { |
| 1584 | print $client "Command? "; |
| 1585 | } |
| 1586 | close $client; |
| 1587 | } |
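
To experiment with the C<Proto>, C<LocalPort>, C<Listen>, and C<Reuse> parameters without a second terminal, the following sketch runs both ends of one connection inside a single process. It is not the server above: binding to 127.0.0.1 and passing C<< LocalPort => 0 >> (which asks the kernel for any free port) are assumptions made here so the example cannot collide with a real service.

```perl
use strict;
use warnings;
use IO::Socket;

# Listening socket on the loopback address; port 0 lets the kernel choose.
my $server = IO::Socket::INET->new(
    Proto     => "tcp",
    LocalAddr => "127.0.0.1",
    LocalPort => 0,
    Listen    => SOMAXCONN,
    Reuse     => 1,
) or die "can't set up server: $@";
my $port = $server->sockport;          # the port the kernel actually assigned

# The connect completes right away: the kernel queues it in the Listen backlog.
my $client = IO::Socket::INET->new(
    Proto    => "tcp",
    PeerAddr => "127.0.0.1",
    PeerPort => $port,
) or die "can't connect to port $port: $@";

# accept() hands back a new bidirectional handle for this one client.
my $conn = $server->accept() or die "accept: $!";
$conn->autoflush(1);
print $conn "hello from port $port\n";

my $greeting = <$client>;
print $greeting;

close $conn;
close $client;
close $server;
```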
| 1588 | |
| 1589 | =head1 UDP: Message Passing |
| 1590 | |
| 1591 | Another kind of client-server setup is one that uses not connections, but |
| 1592 | messages. UDP communications involve much lower overhead but also provide |
| 1593 | less reliability, as there are no promises that messages will arrive at |
| 1594 | all, let alone in order and unmangled. Still, UDP offers some advantages |
| 1595 | over TCP, including being able to "broadcast" or "multicast" to a whole |
| 1596 | bunch of destination hosts at once (usually on your local subnet). If you |
| 1597 | find yourself overly concerned about reliability and start building checks |
| 1598 | into your message system, then you probably should use just TCP to start |
| 1599 | with. |
| 1600 | |
| 1601 | UDP datagrams are I<not> a bytestream and should not be treated as such. |
| 1602 | This makes using I/O mechanisms with internal buffering like stdio (i.e. |
| 1603 | print() and friends) especially cumbersome. Use syswrite(), or better |
| 1604 | send(), like in the example below. |
| 1605 | |
| 1606 | Here's a UDP program similar to the sample Internet TCP client given |
| 1607 | earlier. However, instead of checking one host at a time, the UDP version |
| 1608 | will check many of them asynchronously by simulating a multicast and then |
| 1609 | using select() to do a timed-out wait for I/O. To do something similar |
| 1610 | with TCP, you'd have to use a different socket handle for each host. |
| 1611 | |
| 1612 | #!/usr/bin/perl -w |
| 1613 | use strict; |
| 1614 | use Socket; |
| 1615 | use Sys::Hostname; |
| 1616 | |
| 1617 | my ( $count, $hisiaddr, $hispaddr, $histime, |
| 1618 | $host, $iaddr, $paddr, $port, $proto, |
| 1619 | $rin, $rout, $rtime, $SECS_OF_70_YEARS); |
| 1620 | |
| 1621 | $SECS_OF_70_YEARS = 2_208_988_800; |
| 1622 | |
| 1623 | $iaddr = gethostbyname(hostname()); |
| 1624 | $proto = getprotobyname("udp"); |
| 1625 | $port = getservbyname("time", "udp"); |
| 1626 | $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick |
| 1627 | |
| 1628 | socket(SOCKET, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!"; |
| 1629 | bind(SOCKET, $paddr) || die "bind: $!"; |
| 1630 | |
| 1631 | $| = 1; |
| 1632 | printf "%-12s %8s %s\n", "localhost", 0, scalar localtime(); |
| 1633 | $count = 0; |
| 1634 | for $host (@ARGV) { |
| 1635 | $count++; |
| 1636 | $hisiaddr = inet_aton($host) || die "unknown host"; |
| 1637 | $hispaddr = sockaddr_in($port, $hisiaddr); |
| 1638 | defined(send(SOCKET, 0, 0, $hispaddr)) || die "send $host: $!"; |
| 1639 | } |
| 1640 | |
| 1641 | $rin = ""; |
| 1642 | vec($rin, fileno(SOCKET), 1) = 1; |
| 1643 | |
| 1644 | # timeout after 10.0 seconds |
| 1645 | while ($count && select($rout = $rin, undef, undef, 10.0)) { |
| 1646 | $rtime = ""; |
| 1647 | $hispaddr = recv(SOCKET, $rtime, 4, 0) || die "recv: $!"; |
| 1648 | ($port, $hisiaddr) = sockaddr_in($hispaddr); |
| 1649 | $host = gethostbyaddr($hisiaddr, AF_INET); |
| 1650 | $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS; |
| 1651 | printf "%-12s ", $host; |
| 1652 | printf "%8d %s\n", $histime - time(), scalar localtime($histime); |
| 1653 | $count--; |
| 1654 | } |
| 1655 | |
| 1656 | This example does not include any retries and may consequently fail to |
| 1657 | contact a reachable host. The most prominent reason for this is congestion |
| 1658 | of the queues on the sending host if the number of hosts to contact is |
| 1659 | sufficiently large. |
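
For comparison, here is a minimal sketch of the same send-then-wait pattern using C<IO::Socket::INET> rather than the low-level C<Socket> calls. One datagram goes out with C<send>, and C<select> waits a bounded time for it to arrive. Both sockets live in one process on the loopback address, and the kernel-chosen port and 5-second timeout are assumptions made for this illustration.

```perl
use strict;
use warnings;
use IO::Socket;

# Receiving socket: bound to loopback, port chosen by the kernel.
my $receiver = IO::Socket::INET->new(
    Proto     => "udp",
    LocalAddr => "127.0.0.1",
    LocalPort => 0,
) or die "can't bind receiver: $@";

# Sending socket: PeerAddr/PeerPort fix the default destination.
my $sender = IO::Socket::INET->new(
    Proto    => "udp",
    PeerAddr => "127.0.0.1",
    PeerPort => $receiver->sockport,
) or die "can't create sender: $@";

# One send() call is one datagram; nothing is buffered.
defined($sender->send("ping")) or die "send: $!";

# Wait up to 5 seconds for the receiver to become readable.
my $rin = "";
vec($rin, fileno($receiver), 1) = 1;
my $rout = $rin;
select($rout, undef, undef, 5.0) > 0
    or die "no datagram within 5 seconds";

my $msg = "";
defined($receiver->recv($msg, 1024)) or die "recv: $!";
print "received '$msg'\n";
```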
| 1660 | |
| 1661 | =head1 SysV IPC |
| 1662 | |
| 1663 | While System V IPC isn't so widely used as sockets, it still has some |
| 1664 | interesting uses. However, you cannot use SysV IPC or Berkeley mmap() to |
| 1665 | have a variable shared amongst several processes. That's because Perl |
| 1666 | would reallocate your string when you weren't wanting it to. You might |
| 1667 | look into the C<IPC::Shareable> or C<threads::shared> modules for that. |
| 1668 | |
| 1669 | Here's a small example showing shared memory usage. |
| 1670 | |
| 1671 | use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR); |
| 1672 | |
| 1673 | $size = 2000; |
| 1674 | $id = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR); |
| 1675 | defined($id) || die "shmget: $!"; |
| 1676 | print "shm id $id\n"; |
| 1677 | |
| 1678 | $message = "Message #1"; |
| 1679 | shmwrite($id, $message, 0, 60) || die "shmwrite: $!"; |
| 1680 | print "wrote: '$message'\n"; |
| 1681 | shmread($id, $buff, 0, 60) || die "shmread: $!"; |
| 1682 | print "read : '$buff'\n"; |
| 1683 | |
| 1684 | # the buffer of shmread is zero-character end-padded. |
| 1685 | substr($buff, index($buff, "\0")) = ""; |
| 1686 | print "un" unless $buff eq $message; |
| 1687 | print "swell\n"; |
| 1688 | |
| 1689 | print "deleting shm $id\n"; |
| 1690 | shmctl($id, IPC_RMID, 0) || die "shmctl: $!"; |
| 1691 | |
| 1692 | Here's an example of a semaphore: |
| 1693 | |
| 1694 | use IPC::SysV qw(IPC_CREAT); |
| 1695 | |
| 1696 | $IPC_KEY = 1234; |
| 1697 | $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT); |
| 1698 | defined($id) || die "semget: $!"; |
| 1699 | print "sem id $id\n"; |
| 1700 | |
| 1701 | Put this code in a separate file to be run in more than one process. |
| 1702 | Call the file F<take>: |
| 1703 | |
| 1704 | # look up the semaphore set created above |
| 1705 | |
| 1706 | $IPC_KEY = 1234; |
| 1707 | $id = semget($IPC_KEY, 0, 0); |
| 1708 | defined($id) || die "semget: $!"; |
| 1709 | |
| 1710 | $semnum = 0; |
| 1711 | $semflag = 0; |
| 1712 | |
| 1713 | # "take" semaphore |
| 1714 | # wait for semaphore to be zero |
| 1715 | $semop = 0; |
| 1716 | $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag); |
| 1717 | |
| 1718 | # Increment the semaphore count |
| 1719 | $semop = 1; |
| 1720 | $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag); |
| 1721 | $opstring = $opstring1 . $opstring2; |
| 1722 | |
| 1723 | semop($id, $opstring) || die "semop: $!"; |
| 1724 | |
| 1725 | Put this code in a separate file to be run in more than one process. |
| 1726 | Call this file F<give>: |
| 1727 | |
| 1728 | # "give" the semaphore |
| 1729 | # run this in the original process and you will see |
| 1730 | # that the second process continues |
| 1731 | |
| 1732 | $IPC_KEY = 1234; |
| 1733 | $id = semget($IPC_KEY, 0, 0); |
| 1734 | die unless defined($id); |
| 1735 | |
| 1736 | $semnum = 0; |
| 1737 | $semflag = 0; |
| 1738 | |
| 1739 | # Decrement the semaphore count |
| 1740 | $semop = -1; |
| 1741 | $opstring = pack("s!s!s!", $semnum, $semop, $semflag); |
| 1742 | |
| 1743 | semop($id, $opstring) || die "semop: $!"; |
| 1744 | |
| 1745 | The SysV IPC code above was written long ago, and it's definitely |
| 1746 | clunky looking. For a more modern look, see the IPC::SysV module. |
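
As a rough sketch of that more modern look, the "take" and "give" steps can be written with the C<IPC::Semaphore> class, which ships in the same distribution as C<IPC::SysV>. Unlike the F<take> and F<give> files above, this runs in a single process and uses a private semaphore set; both choices are assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT S_IRUSR S_IWUSR);
use IPC::Semaphore;

# A private set holding one semaphore, readable and writable by its owner.
my $sem = IPC::Semaphore->new(IPC_PRIVATE, 1, S_IRUSR | S_IWUSR | IPC_CREAT)
    or die "semget: $!";
$sem->setval(0, 0) or die "setval: $!";   # start semaphore 0 at zero

# "take": wait for semaphore 0 to be zero, then increment it.
# Each triple is (semaphore number, operation, flags).
$sem->op(0, 0, 0,  0, 1, 0) or die "semop (take): $!";
my $held = $sem->getval(0);

# "give": decrement it again.
$sem->op(0, -1, 0) or die "semop (give): $!";
my $released = $sem->getval(0);

print "after take: $held, after give: $released\n";

# Remove the set so it does not outlive the program.
$sem->remove;
```

The C<op> method packs the triples for you, which replaces the hand-written C<pack("s!s!s!", ...)> calls.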
| 1747 | |
| 1748 | A small example demonstrating SysV message queues: |
| 1749 | |
| 1750 | use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR); |
| 1751 | |
| 1752 | my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR); |
| 1753 | defined($id) || die "msgget failed: $!"; |
| 1754 | |
| 1755 | my $sent = "message"; |
| 1756 | my $type_sent = 1234; |
| 1757 | |
| 1758 | msgsnd($id, pack("l! a*", $type_sent, $sent), 0) |
| 1759 | || die "msgsnd failed: $!"; |
| 1760 | |
| 1761 | msgrcv($id, my $rcvd_buf, 60, 0, 0) |
| 1762 | || die "msgrcv failed: $!"; |
| 1763 | |
| 1764 | my($type_rcvd, $rcvd) = unpack("l! a*", $rcvd_buf); |
| 1765 | |
| 1766 | if ($rcvd eq $sent) { |
| 1767 | print "okay\n"; |
| 1768 | } else { |
| 1769 | print "not okay\n"; |
| 1770 | } |
| 1771 | |
| 1772 | msgctl($id, IPC_RMID, 0) || die "msgctl failed: $!\n"; |
| 1773 | |
| 1774 | =head1 NOTES |
| 1775 | |
| 1776 | Most of these routines quietly but politely return C<undef> when they |
| 1777 | fail instead of causing your program to die right then and there due to |
| 1778 | an uncaught exception. (Actually, some of the new I<Socket> conversion |
| 1779 | functions do croak() on bad arguments.) It is therefore essential to |
| 1780 | check return values from these functions. Always begin your socket |
| 1781 | programs this way for optimal success, and don't forget to add the B<-T> |
| 1782 | taint-checking flag to the C<#!> line for servers: |
| 1783 | |
| 1784 | #!/usr/bin/perl -Tw |
| 1785 | use strict; |
| 1786 | use sigtrap; |
| 1787 | use Socket; |
| 1788 | |
| 1789 | =head1 BUGS |
| 1790 | |
| 1791 | These routines all create system-specific portability problems. As noted |
| 1792 | elsewhere, Perl is at the mercy of your C libraries for much of its system |
| 1793 | behavior. It's probably safest to assume broken SysV semantics for |
| 1794 | signals and to stick with simple TCP and UDP socket operations; e.g., don't |
| 1795 | try to pass open file descriptors over a local UDP datagram socket if you |
| 1796 | want your code to stand a chance of being portable. |
| 1797 | |
| 1798 | =head1 AUTHOR |
| 1799 | |
| 1800 | Tom Christiansen, with occasional vestiges of Larry Wall's original |
| 1801 | version and suggestions from the Perl Porters. |
| 1802 | |
| 1803 | =head1 SEE ALSO |
| 1804 | |
| 1805 | There's a lot more to networking than this, but this should get you |
| 1806 | started. |
| 1807 | |
| 1808 | For intrepid programmers, the indispensable textbook is I<Unix Network |
| 1809 | Programming, 2nd Edition, Volume 1> by W. Richard Stevens (published by |
| 1810 | Prentice-Hall). Most books on networking address the subject from the |
| 1811 | perspective of a C programmer; translation to Perl is left as an exercise |
| 1812 | for the reader. |
| 1813 | |
| 1814 | The IO::Socket(3) manpage describes the object library, and the Socket(3) |
| 1815 | manpage describes the low-level interface to sockets. Besides the obvious |
| 1816 | functions in L<perlfunc>, you should also check out the F<modules> file at |
| 1817 | your nearest CPAN site, especially |
| 1818 | L<http://www.cpan.org/modules/00modlist.long.html#ID5_Networking_>. |
| 1819 | See L<perlmodlib> or, best yet, the F<Perl FAQ> for a description |
| 1820 | of what CPAN is and where to get it if the previous link doesn't work |
| 1821 | for you. |
| 1822 | |
| 1823 | Section 5 of CPAN's F<modules> file is devoted to "Networking, Device |
| 1824 | Control (modems), and Interprocess Communication", and contains numerous |
| 1825 | unbundled modules for networking, Chat and Expect operations, |
| 1826 | CGI programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet, |
| 1827 | Threads, and ToolTalk--to name just a few. |