X-Git-Url: https://perl5.git.perl.org/perl5.git/blobdiff_plain/15af043884e0520355045b5d53efce3cdf6f3094..b81c91435ed95d8d84d43371422272a3f22ebd7d:/pod/perliol.pod diff --git a/pod/perliol.pod b/pod/perliol.pod index 81cbab1..0f71b93 100644 --- a/pod/perliol.pod +++ b/pod/perliol.pod @@ -22,7 +22,58 @@ maintain (source) compatibility. The aim of the implementation is to provide the PerlIO API in a flexible and platform neutral manner. It is also a trial of an "Object Oriented -C, with vtables" approach which may be applied to perl6. +C, with vtables" approach which may be applied to Perl 6. + +=head2 Basic Structure + +PerlIO is a stack of layers. + +The low levels of the stack work with the low-level operating system +calls (file descriptors in C) getting bytes in and out, the higher +layers of the stack buffer, filter, and otherwise manipulate the I/O, +and return characters (or bytes) to Perl. Terms I<above> and I<below> +are used to refer to the relative positioning of the stack layers. + +A layer contains a "vtable", the table of I/O operations (at C level +a table of function pointers), and status flags. The functions in the +vtable implement operations like "open", "read", and "write". + +When I/O, for example "read", is requested, the request goes from Perl +first down the stack using "read" functions of each layer, then at the +bottom the input is requested from the operating system services, then +the result is returned up the stack, finally being interpreted as Perl +data. + +The requests do not necessarily go always all the way down to the +operating system: that's where PerlIO buffering comes into play. + +When you do an open() and specify extra PerlIO layers to be deployed, +the layers you specify are "pushed" on top of the already existing +default stack. One way to see it is that "operating system is +on the left" and "Perl is on the right". 
+ +What exact layers are in this default stack depends on a lot of +things: your operating system, Perl version, Perl compile time +configuration, and Perl runtime configuration. See L, +L, and L for more information. + +binmode() operates similarly to open(): by default the specified +layers are pushed on top of the existing stack. + +However, note that even as the specified layers are "pushed on top" +for open() and binmode(), this doesn't mean that the effects are +limited to the "top": PerlIO layers can be very 'active' and inspect +and affect layers also deeper in the stack. As an example there +is a layer called "raw" which repeatedly "pops" layers until +it reaches the first layer that has declared itself capable of +handling binary data. The "pushed" layers are processed in left-to-right +order. + +sysopen() operates (unsurprisingly) at a lower level in the stack than +open(). For example in Unix or Unix-like systems sysopen() operates +directly at the level of file descriptors: in the terms of PerlIO +layers, it uses only the "unix" layer, which is a rather thin wrapper +on top of the Unix file descriptors. 
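The request flow described under "Basic Structure" can be sketched with a toy model in plain C. This is only an illustration of the "stack of layers with vtables" idea: every name here (C<toy_layer>, C<toy_funcs>, C<toy_push>, C<toy_read>, and the two sample layers) is hypothetical and is not part of the real PerlIO API, whose function table is the C<struct _PerlIO_funcs> shown later in this document. The bottom layer stands in for the operating system by serving bytes from an in-memory string; the layer pushed on top asks the layer below for data and transforms it on the way back up.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical names, NOT the real PerlIO types. */
typedef struct toy_layer toy_layer;

typedef struct {
    const char *name;
    /* Read up to count bytes into buf; returns the number of bytes read. */
    size_t (*read)(toy_layer *self, char *buf, size_t count);
} toy_funcs;

struct toy_layer {
    const toy_funcs *tab; /* the "vtable" of this layer */
    toy_layer *next;      /* next layer down, NULL at the bottom */
    const char *src;      /* bottom layer only: stand-in for OS data */
    size_t pos;           /* bottom layer only: current read offset */
};

/* Bottom layer: plays the role of the operating system services. */
static size_t bottom_read(toy_layer *self, char *buf, size_t count)
{
    size_t left = strlen(self->src) - self->pos;
    size_t n = count < left ? count : left;
    memcpy(buf, self->src + self->pos, n);
    self->pos += n;
    return n;
}

/* Filter layer: passes the request down, then transforms the result
 * (here: upper-casing) as it travels back up the stack. */
static size_t upper_read(toy_layer *self, char *buf, size_t count)
{
    size_t n = self->next->tab->read(self->next, buf, count);
    for (size_t i = 0; i < n; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);
    return n;
}

static const toy_funcs toy_bottom = { "bottom", bottom_read };
static const toy_funcs toy_upper  = { "upper",  upper_read  };

/* Push a layer on top of the existing stack; returns the new top. */
static toy_layer *toy_push(toy_layer *top, toy_layer *layer,
                           const toy_funcs *tab)
{
    layer->tab = tab;
    layer->next = top;
    return layer;
}

/* A read always enters the stack at the top layer. */
static size_t toy_read(toy_layer *top, char *buf, size_t count)
{
    return top->tab->read(top, buf, count);
}
```

To try it, initialise a bottom layer with C<&toy_bottom> and a source string, push a second layer with C<toy_push(&bottom, &filter, &toy_upper)>, and call C<toy_read> on the returned top. The sketch leaves out buffering, status flags, and error handling, all of which the real layers have to deal with.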
=head2 Layers vs Disciplines @@ -87,13 +138,14 @@ same as the public C functions: struct _PerlIO_funcs { + Size_t fsize; char * name; Size_t size; IV kind; - IV (*Pushed)(pTHX_ PerlIO *f,const char *mode,SV *arg); + IV (*Pushed)(pTHX_ PerlIO *f,const char *mode,SV *arg, PerlIO_funcs *tab); IV (*Popped)(pTHX_ PerlIO *f); PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab, - AV *layers, IV n, + PerlIO_list_t *layers, IV n, const char *mode, int fd, int imode, int perm, PerlIO *old, @@ -124,9 +176,9 @@ same as the public C functions: void (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt); }; -The first few members of the struct give a "name" for the layer, the -size to C for the per-instance data, and some flags which are -attributes of the class as whole (such as whether it is a buffering +The first few members of the struct give a function table size for +compatibility check "name" for the layer, the size to C for the per-instance data, +and some flags which are attributes of the class as whole (such as whether it is a buffering layer), then follow the functions which fall into four basic groups: =over 4 @@ -194,7 +246,7 @@ representing open (allocated) handles. For example the first three slots in the table correspond to C,C and C. The table in turn points to the current "top" layer for the handle - in this case an instance of the generic buffering layer "perlio". That layer in turn -points to the next layer down - in this case the lowlevel "unix" layer. +points to the next layer down - in this case the low-level "unix" layer. The above is roughly equivalent to a "stdio" buffered stream, but with much more flexibility: @@ -323,6 +375,14 @@ to change during one "get".) =over 4 +=item fsize + + Size_t fsize; + +Size of the function table. This is compared against the value PerlIO +code "knows" as a compatibility check. Future versions I be able +to tolerate layers compiled against an old version of the headers. 
+ +=item name char * name; @@ -343,6 +403,14 @@ The size of the per-instance data structure, e.g.: sizeof(PerlIOAPR) +If this field is zero then C does not malloc anything +and assumes layer's Pushed function will do any required layer stack +manipulation - used to avoid malloc/free overhead for dummy layers. +If the field is non-zero it must be at least the size of C, +C will allocate memory for the layer's data structures +and link new layer onto the stream's stack. (If the layer's Pushed +method returns an error indication the layer is popped again.) + =item kind IV kind; @@ -403,7 +471,10 @@ struct. It should also C any unconsumed data that has been read and buffered from the layer below back to that layer, so that it can be re-provided to what ever is now above. -Returns 0 on success and failure. +Returns 0 on success and failure. If C<Popped()> returns I<true> then +I<perlio.c> assumes that either the layer has popped itself, or the +layer is super special and needs to be retained for other reasons. +In most cases it should return I<false>. =item Open @@ -415,7 +486,7 @@ C and C. The full prototype is as follows: PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab, - AV *layers, IV n, + PerlIO_list_t *layers, IV n, const char *mode, int fd, int imode, int perm, PerlIO *old, @@ -423,7 +494,7 @@ follows: Open should (perhaps indirectly) call C to allocate a slot in the table and associate it with the layers information for -the opened file, by calling C. The I AV is an +the opened file, by calling C. The I is an array of all the layers destined for the C, and any arguments passed to them, I is the index into that array of the layer being called. The macro C will return a (possibly @@ -437,10 +508,10 @@ special C calls; the C<'#'> prefix means that this is C and that I and I should be passed to C; C<'r'> means B<r>ead, C<'w'> means B<w>rite and C<'a'> means B<a>ppend. The C<'+'> suffix means that both reading and -writing/appending are permitted. 
The C<'b'> suffix means file should -be binary, and C<'t'> means it is text. (Binary/Text should be ignored -by almost all layers and binary IO done, with PerlIO. The C<:crlf> -layer should be pushed to handle the distinction.) +writing/appending are permitted. The C<'b'> suffix means file should +be binary, and C<'t'> means it is text. (Almost all layers should do +the IO in binary mode, and ignore the b/t bits. The C<:crlf> layer +should be pushed to handle the distinction.) If I is not C then this is a C. Perl itself does not use this (yet?) and semantics are a little vague. @@ -455,11 +526,16 @@ passed to C, otherwise it will be 1 if for example C was called. In simple cases SvPV_nolen(*args) is the pathname to open. -Having said all that translation-only layers do not need to provide -C at all, but rather leave the opening to a lower level layer -and wait to be "pushed". If a layer does provide C it should -normally call the C method of next layer down (if any) and -then push itself on top if that succeeds. +If a layer provides C it should normally call the C +method of next layer down (if any) and then push itself on top if that +succeeds. C is provided to do exactly that, so in +most cases you don't have to write your own C method. If this +method is not defined, other layers may have difficulty pushing +themselves on top of it during open. + +If C was performed and open has failed, it must +C itself, since if it's not, the layer won't be removed +and may cause bad problems. Returns C on failure. @@ -484,6 +560,10 @@ pushed. e.g. ":encoding(ascii)" would return an SvPV with value "ascii". (I and I arguments can be ignored in most cases) +C uses C to retrieve the argument originally passed to +C, so you must implement this function if your layer has an +extra argument to C and will ever be Ced. + =item Fileno IV (*Fileno)(pTHX_ PerlIO *f); @@ -492,18 +572,19 @@ Returns the Unix/Posix numeric file descriptor for the handle. 
Normally C<PerlIOBase_fileno> (which just asks next layer down) will suffice for this. -Returns -1 if the layer cannot provide such a file descriptor, or in -the case of the error. - -XXX: two possible results end up in -1, one is an error the other is -not. +Returns -1 on error, which is considered to include the case where the +layer cannot provide such a file descriptor. =item Dup PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o, CLONE_PARAMS *param, int flags); -XXX: not documented +XXX: Needs more docs. + +Used as part of the "clone" process when a thread is spawned (in which +case param will be non-NULL) and when a stream is being duplicated via +'&' in the C<open>. Similar to C<Open>, returns PerlIO* on success, C<NULL> on failure. @@ -657,6 +738,110 @@ The application (or layer above) must ensure they are consistent. =back +=head2 Utilities + +To ask for the next layer down use PerlIONext(PerlIO *f). + +To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All +this does is really just to check that the pointer is non-NULL and +that the pointer behind that is non-NULL.) + +PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words, +the C pointer. + +PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type. + +Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either +calls the I<callback> from the functions of the layer I<f> (just by +the name of the IO function, like "Read") with the I<args>, or if +there is no such callback, calls the I<base> version of the callback +with the same args, or if the f is invalid, sets errno to EBADF and +returns I<failure>. + +Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls +the I<callback> of the functions of the layer I<f> with the I<args>, +or if there is no such callback, sets errno to EINVAL. Or if the f is +invalid, sets errno to EBADF and returns I<failure>. 
+ +Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls +the I<callback> of the functions of the layer I<f> with the I<args>, +or if there is no such callback, calls the I<base> version of the +callback with the same args, or if the f is invalid, sets errno to +EBADF. + +Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the +I<callback> of the functions of the layer I<f> with the I<args>, or if +there is no such callback, sets errno to EINVAL. Or if the f is +invalid, sets errno to EBADF. + +=head2 Implementing PerlIO Layers + +If you find the implementation document unclear or not sufficient, +look at the existing PerlIO layer implementations, which include: + +=over + +=item * C implementations + +The F and F in the Perl core implement the +"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending" +layers, and also the "mmap" and "win32" layers if applicable. +(The "win32" is currently unfinished and unused, to see what is used +instead in Win32, see L .) + +PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core. + +PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN. + +=item * Perl implementations + +PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN. + +=back + +If you are creating a PerlIO layer, you may want to be lazy, in other +words, implement only the methods that interest you. The other methods +you can either replace with the "blank" methods + + PerlIOBase_noop_ok + PerlIOBase_noop_fail + +(which do nothing, and return zero and -1, respectively) or for +certain methods you may assume a default behaviour by using a NULL +method. The Open method looks for help in the 'parent' layer. 
+The following table summarizes the behaviour:
+
+    method          behaviour with NULL
+
+    Clearerr        PerlIOBase_clearerr
+    Close           PerlIOBase_close
+    Dup             PerlIOBase_dup
+    Eof             PerlIOBase_eof
+    Error           PerlIOBase_error
+    Fileno          PerlIOBase_fileno
+    Fill            FAILURE
+    Flush           SUCCESS
+    Getarg          SUCCESS
+    Get_base        FAILURE
+    Get_bufsiz      FAILURE
+    Get_cnt         FAILURE
+    Get_ptr         FAILURE
+    Open            INHERITED
+    Popped          SUCCESS
+    Pushed          SUCCESS
+    Read            PerlIOBase_read
+    Seek            FAILURE
+    Set_cnt         FAILURE
+    Set_ptrcnt      FAILURE
+    Setlinebuf      PerlIOBase_setlinebuf
+    Tell            FAILURE
+    Unread          PerlIOBase_unread
+    Write           FAILURE
+
+    FAILURE         Set errno (to EINVAL in Unixish, to LIB$_INVARG in VMS) and
+                    return -1 (for numeric return values) or NULL (for pointers)
+    INHERITED       Inherited from the layer below
+    SUCCESS         Return 0 (for numeric return values) or a pointer
 =head2 Core Layers @@ -762,27 +947,31 @@ makes this layer available, although F "knows" where to find it. It is an example of a layer which takes an argument as it is called thus: - open($fh,"<:encoding(iso-8859-7)",$pathname) + open( $fh, "<:encoding(iso-8859-7)", $pathname ); -=item ":Scalar" +=item ":scalar" -Provides support for +Provides support for reading data from and writing data to a scalar. - open($fh,"...",\$scalar) + open( $fh, "+<:scalar", \$scalar ); When a handle is so opened, then reads get bytes from the string value of I<$scalar>, and writes change the value. In both cases the position in I<$scalar> starts as zero but can be altered via C<seek>, and determined via C<tell>. -=item ":Via" +Please note that this layer is implied when calling open() thus: + + open( $fh, "+<", \$scalar ); + +=item ":via" Provided to allow layers to be implemented as Perl code. For instance: - use MIME::QuotedPrint; - open(my $fh, ">Via(MIME::QuotedPrint)", "qp"); + use PerlIO::via::StripHTML; + open( my $fh, "<:via(StripHTML)", "index.html" ); -See L for details. +See L for details. =back @@ -849,6 +1038,3 @@ a person who is not a PerlIO guru (yet). =back =cut - - -