| 1 | =head1 NAME |
| 2 | |
| 3 | perlclib - Internal replacements for standard C library functions |
| 4 | |
| 5 | =head1 DESCRIPTION |
| 6 | |
| 7 | One thing Perl porters should note is that F<perl> doesn't tend to use that |
| 8 | much of the C standard library internally; you'll see very little use of, |
| 9 | for example, the F<ctype.h> functions in there. This is because Perl |
| 10 | tends to reimplement or abstract standard library functions, so that we |
| 11 | know exactly how they're going to operate. |
| 12 | |
| 13 | This is a reference card for people who are familiar with the C library |
| 14 | and who want to do things the Perl way. It tells them which functions |
| 15 | they ought to use instead of the more usual C functions. |
| 16 | |
| 17 | =head2 Conventions |
| 18 | |
| 19 | In the following tables: |
| 20 | |
| 21 | =over 3 |
| 22 | |
| 23 | =item C<t> |
| 24 | |
| 25 | is a type. |
| 26 | |
| 27 | =item C<p> |
| 28 | |
| 29 | is a pointer. |
| 30 | |
| 31 | =item C<n> |
| 32 | |
| 33 | is a number. |
| 34 | |
| 35 | =item C<s> |
| 36 | |
| 37 | is a string. |
| 38 | |
| 39 | =back |
| 40 | |
| 41 | C<sv>, C<av>, C<hv>, etc. represent variables of their respective types. |
| 42 | |
| 43 | =head2 File Operations |
| 44 | |
| 45 | Instead of the F<stdio.h> functions, you should use the Perl abstraction |
| 46 | layer. Instead of C<FILE*> types, you need to be handling C<PerlIO*> |
| 47 | types. Don't forget that with the PerlIO layered I/O abstraction, |
| 48 | C<FILE*> types may not even be available. See also L<perlapio> for |
| 49 | more information about the following functions: |
| 50 | |
| 51 |     Instead Of:                 Use: |
| 52 | |
| 53 |     stdin                       PerlIO_stdin() |
| 54 |     stdout                      PerlIO_stdout() |
| 55 |     stderr                      PerlIO_stderr() |
| 56 | |
| 57 |     fopen(fn, mode)             PerlIO_open(fn, mode) |
| 58 |     freopen(fn, mode, stream)   PerlIO_reopen(fn, mode, perlio) |
| 59 |                                   (Deprecated) |
| 60 |     fflush(stream)              PerlIO_flush(perlio) |
| 61 |     fclose(stream)              PerlIO_close(perlio) |
| 62 | |
| 63 | =head2 File Input and Output |
| 64 | |
| 65 |     Instead Of:                 Use: |
| 66 | |
| 67 |     fprintf(stream, fmt, ...)   PerlIO_printf(perlio, fmt, ...) |
| 68 | |
| 69 |     [f]getc(stream)             PerlIO_getc(perlio) |
| 70 |     [f]putc(stream, n)          PerlIO_putc(perlio, n) |
| 71 |     ungetc(n, stream)           PerlIO_ungetc(perlio, n) |
| 72 | |
| 73 | Note that the PerlIO equivalents of C<fread> and C<fwrite> are slightly |
| 74 | different from their C library counterparts: |
| 75 | |
| 76 |     fread(p, size, n, stream)   PerlIO_read(perlio, buf, numbytes) |
| 77 |     fwrite(p, size, n, stream)  PerlIO_write(perlio, buf, numbytes) |
| 78 | |
| 79 |     fputs(s, stream)            PerlIO_puts(perlio, s) |
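
The difference in calling convention noted above for C<fread> can be sketched with plain F<stdio.h>, so that it compiles without the perl headers. C<read_bytes> and C<read_items_as_bytes> are hypothetical helpers, not part of Perl; they only mimic the byte-count argument style of C<PerlIO_read(perlio, buf, numbytes)> and do not reproduce its error reporting.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of Perl: reads up to numbytes bytes and
 * returns the number of bytes read, mirroring the byte-count argument
 * style of PerlIO_read(perlio, buf, numbytes). */
static long read_bytes(FILE *stream, void *buf, size_t numbytes)
{
    assert(stream != NULL && buf != NULL);
    return (long)fread(buf, 1, numbytes, stream);
}

/* fread(p, size, n, stream) counts items; when converting such a call,
 * the byte count has to be spelled out as size * n. */
static long read_items_as_bytes(FILE *stream, void *buf,
                                size_t size, size_t n)
{
    return read_bytes(stream, buf, size * n);
}
```

When porting an C<fread> call, the practical consequence is that both the length argument and the return value are in bytes rather than items.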
| 80 | |
| 81 | There is no equivalent to C<fgets>; one should use C<sv_gets> instead: |
| 82 | |
| 83 |     fgets(s, n, stream)         sv_gets(sv, perlio, append) |
| 84 | |
| 85 | =head2 File Positioning |
| 86 | |
| 87 |     Instead Of:                 Use: |
| 88 | |
| 89 |     feof(stream)                PerlIO_eof(perlio) |
| 90 |     fseek(stream, n, whence)    PerlIO_seek(perlio, n, whence) |
| 91 |     rewind(stream)              PerlIO_rewind(perlio) |
| 92 | |
| 93 |     fgetpos(stream, p)          PerlIO_getpos(perlio, sv) |
| 94 |     fsetpos(stream, p)          PerlIO_setpos(perlio, sv) |
| 95 | |
| 96 |     ferror(stream)              PerlIO_error(perlio) |
| 97 |     clearerr(stream)            PerlIO_clearerr(perlio) |
| 98 | |
| 99 | =head2 Memory Management and String Handling |
| 100 | |
| 101 |     Instead Of:                    Use: |
| 102 | |
| 103 |     t* p = malloc(n)               Newx(p, n, t) |
| 104 |     t* p = calloc(n, s)            Newxz(p, n, t) |
| 105 |     p = realloc(p, n)              Renew(p, n, t) |
| 106 |     memcpy(dst, src, n)            Copy(src, dst, n, t) |
| 107 |     memmove(dst, src, n)           Move(src, dst, n, t) |
| 108 |     memcpy(dst, src, sizeof(t))    StructCopy(src, dst, t) |
| 109 |     memset(dst, 0, n * sizeof(t))  Zero(dst, n, t) |
| 110 |     memzero(dst, n)                Zero(dst, n, char) |
| 111 |     free(p)                        Safefree(p) |
| 112 | |
| 113 |     strdup(p)                      savepv(p) |
| 114 |     strndup(p, n)                  savepvn(p, n) (Hey, strndup doesn't |
| 115 |                                      exist!) |
| 116 | |
| 117 |     strstr(big, little)            instr(big, little) |
| 118 |     strcmp(s1, s2)                 strLE(s1, s2) / strEQ(s1, s2) |
| 119 |                                      / strGT(s1, s2) |
| 120 |     strncmp(s1, s2, n)             strnNE(s1, s2, n) / strnEQ(s1, s2, n) |
| 121 | |
| 122 | Note that C<Copy> and C<Move> take the source before the destination, |
| 123 | the reverse of C<memcpy> and C<memmove>. |
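
The argument order can be sketched as follows. The two macros here are simplified stand-ins written for this example so that it compiles without the perl headers; the real C<Copy> and C<Move> are provided by perl's own headers and carry extra checks. C<copy_ints> and C<shift_right> are hypothetical helper names.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-ins for Perl's Copy and Move, for illustration only.
 * Source comes first, and the count is in elements of type t, not bytes. */
#define Copy(src, dst, n, t) ((void)memcpy((dst), (src), (n) * sizeof(t)))
#define Move(src, dst, n, t) ((void)memmove((dst), (src), (n) * sizeof(t)))

/* Copy the first n ints of src into dst (regions must not overlap). */
static void copy_ints(const int *src, int *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    Copy(src, dst, n, int);      /* memcpy(dst, src, n * sizeof(int)) */
}

/* Shift the first n ints of buf one slot to the right. The regions
 * overlap, so Move (memmove) is needed rather than Copy. */
static void shift_right(int *buf, size_t n)
{
    assert(buf != NULL);
    Move(buf, buf + 1, n, int);  /* memmove(buf + 1, buf, n * sizeof(int)) */
}
```

Besides the swapped arguments, the count is given in elements and the element type is passed explicitly, so no C<sizeof> appears at the call site.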
| 124 | |
| 125 | Most of the time, though, you'll want to be dealing with SVs internally |
| 126 | instead of raw C<char *> strings: |
| 127 | |
| 128 |     strlen(s)                   sv_len(sv) |
| 129 |     strcpy(dt, src)             sv_setpv(sv, s) |
| 130 |     strncpy(dt, src, n)         sv_setpvn(sv, s, n) |
| 131 |     strcat(dt, src)             sv_catpv(sv, s) |
| 132 |     strncat(dt, src, n)         sv_catpvn(sv, s, n) |
| 133 |     sprintf(s, fmt, ...)        sv_setpvf(sv, fmt, ...) |
| 134 | |
| 135 | Note also the existence of C<sv_catpvf> and C<sv_vcatpvfn>, combining |
| 136 | concatenation with formatting. |
| 137 | |
| 138 | Sometimes instead of zeroing the allocated heap by using Newxz() you |
| 139 | should consider "poisoning" the data. This means writing a bit |
| 140 | pattern into it that should be illegal as pointers (and floating point |
| 141 | numbers), and also hopefully surprising enough as integers, so that |
| 142 | any code attempting to use the data without forethought will break |
| 143 | sooner rather than later. Poisoning can be done using the Poison() |
| 144 | macros, which have similar arguments to Zero(): |
| 145 | |
| 146 |     PoisonWith(dst, n, t, b)    scribble memory with byte b |
| 147 |     PoisonNew(dst, n, t)        equal to PoisonWith(dst, n, t, 0xAB) |
| 148 |     PoisonFree(dst, n, t)       equal to PoisonWith(dst, n, t, 0xEF) |
| 149 |     Poison(dst, n, t)           equal to PoisonFree(dst, n, t) |
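
A minimal sketch of poisoning, assuming plain C<malloc> and C<free> in place of C<Newx> and C<Safefree>. The macros below are simplified stand-ins with the byte values from the table above, not the real definitions; C<alloc_poisoned> and C<release_poisoned> are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the Poison macros, for illustration only. */
#define PoisonWith(dst, n, t, b) ((void)memset((dst), (b), (n) * sizeof(t)))
#define PoisonNew(dst, n, t)     PoisonWith(dst, n, t, 0xAB)
#define PoisonFree(dst, n, t)    PoisonWith(dst, n, t, 0xEF)

/* Allocate n unsigned ints and fill them with the "new" pattern, so a
 * read before initialisation yields 0xAB bytes rather than zeros. */
static unsigned int *alloc_poisoned(size_t n)
{
    unsigned int *p = malloc(n * sizeof *p);
    if (p != NULL)
        PoisonNew(p, n, unsigned int);
    return p;
}

/* Scribble the "freed" pattern over the block before releasing it.
 * A compiler may elide this store since the memory is freed next. */
static void release_poisoned(unsigned int *p, size_t n)
{
    assert(p != NULL);
    PoisonFree(p, n, unsigned int);
    free(p);
}
```

Seeing C<0xABABABAB> in a debugger then suggests an uninitialised read, while C<0xEFEFEFEF> suggests use after free.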
| 150 | |
| 151 | =head2 Character Class Tests |
| 152 | |
| 153 | There are several types of character class tests that Perl implements. |
| 154 | The only ones described here are those that directly correspond to C |
| 155 | library functions that operate on 8-bit characters, but there are |
| 156 | equivalents that operate on wide characters, and UTF-8 encoded strings. |
| 157 | All are more fully described in L<perlapi/Character classes> and |
| 158 | L<perlapi/Character case changing>. |
| 159 | |
| 160 | The C library routines listed in the table below return values based on |
| 161 | the current locale. Use the entries in the final column for that |
| 162 | functionality. The other two columns always assume a POSIX (or C) |
| 163 | locale. The entries in the ASCII column are only meaningful for ASCII |
| 164 | inputs, returning FALSE for anything else. Use these only when you |
| 165 | B<know> that is what you want. The entries in the Latin1 column assume |
| 166 | that the non-ASCII 8-bit characters are as Unicode defines them, the |
| 167 | same as ISO-8859-1, often called Latin 1. |
| 168 | |
| 169 |     Instead Of:  Use for ASCII:     Use for Latin1:       Use for locale: |
| 170 | |
| 171 |     isalnum(c)   isALPHANUMERIC(c)  isALPHANUMERIC_L1(c)  isALPHANUMERIC_LC(c) |
| 172 |     isalpha(c)   isALPHA(c)         isALPHA_L1(c)         isALPHA_LC(c) |
| 173 |     isascii(c)   isASCII(c)                               isASCII_LC(c) |
| 174 |     isblank(c)   isBLANK(c)         isBLANK_L1(c)         isBLANK_LC(c) |
| 175 |     iscntrl(c)   isCNTRL(c)         isCNTRL_L1(c)         isCNTRL_LC(c) |
| 176 |     isdigit(c)   isDIGIT(c)         isDIGIT_L1(c)         isDIGIT_LC(c) |
| 177 |     isgraph(c)   isGRAPH(c)         isGRAPH_L1(c)         isGRAPH_LC(c) |
| 178 |     islower(c)   isLOWER(c)         isLOWER_L1(c)         isLOWER_LC(c) |
| 179 |     isprint(c)   isPRINT(c)         isPRINT_L1(c)         isPRINT_LC(c) |
| 180 |     ispunct(c)   isPUNCT(c)         isPUNCT_L1(c)         isPUNCT_LC(c) |
| 181 |     isspace(c)   isSPACE(c)         isSPACE_L1(c)         isSPACE_LC(c) |
| 182 |     isupper(c)   isUPPER(c)         isUPPER_L1(c)         isUPPER_LC(c) |
| 183 |     isxdigit(c)  isXDIGIT(c)        isXDIGIT_L1(c)        isXDIGIT_LC(c) |
| 184 | |
| 185 |     tolower(c)   toLOWER(c)         toLOWER_L1(c)         toLOWER_LC(c) |
| 186 |     toupper(c)   toUPPER(c)                               toUPPER_LC(c) |
| 187 | |
| 188 | To emphasize that you are operating only on ASCII characters, you can |
| 189 | append C<_A> to each of the macros in the ASCII column: C<isALPHA_A>, |
| 190 | C<isDIGIT_A>, and so on. |
| 191 | |
| 192 | (There is no entry in the Latin1 column for C<isascii> even though there |
| 193 | is an C<isASCII_L1>, which is identical to C<isASCII>; the |
| 194 | latter name is clearer. There is no entry in the Latin1 column for |
| 195 | C<toupper> because the result can be non-Latin1. You have to use |
| 196 | C<toUPPER_uni>, as described in L<perlapi/Character case changing>.) |
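
The difference between the ASCII and Latin1 columns can be sketched with two hypothetical stand-in functions. These are not the real macros; they assume an ASCII platform and only mimic what the text above says about C<isALPHA> and C<isALPHA_L1>, so that the example compiles without the perl headers.

```c
#include <assert.h>

/* Hypothetical stand-in for the ASCII column (isALPHA): true only for
 * ASCII letters, whatever the current locale is. */
static int is_alpha_ascii(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Hypothetical stand-in for the Latin1 column (isALPHA_L1): also accepts
 * the ISO-8859-1 code points that Unicode classifies as alphabetic. */
static int is_alpha_latin1(int c)
{
    assert(c >= 0 && c <= 0xFF);
    if (is_alpha_ascii(c))
        return 1;
    return c == 0xAA || c == 0xB5 || c == 0xBA
        || (c >= 0xC0 && c <= 0xD6)
        || (c >= 0xD8 && c <= 0xF6)
        || (c >= 0xF8 && c <= 0xFF);
}
```

For example, C<0xE9> (LATIN SMALL LETTER E WITH ACUTE) is alphabetic only by the Latin1 rule, while C<0xD7> (MULTIPLICATION SIGN) is alphabetic by neither.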
| 197 | |
| 198 | =head2 F<stdlib.h> functions |
| 199 | |
| 200 |     Instead Of:                 Use: |
| 201 | |
| 202 |     atof(s)                     Atof(s) |
| 203 |     atol(s)                     Atol(s) |
| 204 |     strtod(s, &p)               Nothing. Just don't use it. |
| 205 |     strtol(s, &p, n)            Strtol(s, &p, n) |
| 206 |     strtoul(s, &p, n)           Strtoul(s, &p, n) |
| 207 | |
| 208 | Notice also the C<grok_bin>, C<grok_hex>, and C<grok_oct> functions in |
| 209 | F<numeric.c> for converting strings representing numbers in the respective |
| 210 | bases into C<NV>s. |
| 211 | |
| 212 | In theory C<Strtol> and C<Strtoul> may not be defined if the machine perl is |
| 213 | built on doesn't actually have C<strtol> and C<strtoul>. But as those two |
| 214 | functions are part of the 1989 ANSI C spec, we suspect you'll find them |
| 215 | everywhere by now. |
| 216 | |
| 217 |     int rand()                  double Drand01() |
| 218 |     srand(n)                    { seedDrand01((Rand_seed_t)n); |
| 219 |                                   PL_srand_called = TRUE; } |
| 220 | |
| 221 |     exit(n)                     my_exit(n) |
| 222 |     system(s)                   Don't. Look at pp_system or use my_popen |
| 223 | |
| 224 |     getenv(s)                   PerlEnv_getenv(s) |
| 225 |     setenv(s, val)              my_putenv(s, val) |
| 226 | |
| 227 | =head2 Miscellaneous functions |
| 228 | |
| 229 | You should not even B<want> to use F<setjmp.h> functions, but if you |
| 230 | think you do, use the C<JMPENV> stack in F<scope.h> instead. |
| 231 | |
| 232 | For C<signal>/C<sigaction>, use C<rsignal(signo, handler)>. |
| 233 | |
| 234 | =head1 SEE ALSO |
| 235 | |
| 236 | L<perlapi>, L<perlapio>, L<perlguts> |
| 237 | |