X-Git-Url: https://perl5.git.perl.org/perl5.git/blobdiff_plain/11e1c8f2799d09ead03e234a9f91269f8b6b179b..b0a2be78d1e39d7609b90d53c8fffeee3363fb76:/pod/perliol.pod diff --git a/pod/perliol.pod b/pod/perliol.pod index ac6a4a2..466959b 100644 --- a/pod/perliol.pod +++ b/pod/perliol.pod @@ -1,4 +1,3 @@ - =head1 NAME perliol - C API for Perl's implementation of IO in Layers. @@ -8,7 +7,6 @@ perliol - C API for Perl's implementation of IO in Layers. /* Defining a layer ... */ #include - =head1 DESCRIPTION This document describes the behavior and implementation of the PerlIO @@ -19,13 +17,64 @@ C is not). The PerlIO abstraction was introduced in perl5.003_02 but languished as just an abstraction until perl5.7.0. However during that time a number -of perl extentions switched to using it, so the API is mostly fixed to +of perl extensions switched to using it, so the API is mostly fixed to maintain (source) compatibility. The aim of the implementation is to provide the PerlIO API in a flexible and platform neutral manner. It is also a trial of an "Object Oriented C, with vtables" approach which may be applied to perl6. +=head2 Basic Structure + +PerlIO is a stack of layers. + +The low levels of the stack work with the low-level operating system +calls (file descriptors in C) getting bytes in and out, the higher +layers of the stack buffer, filter, and otherwise manipulate the I/O, +and return characters (or bytes) to Perl. Terms I and I +are used to refer to the relative positioning of the stack layers. + +A layer contains a "vtable", the table of I/O operations (at C level +a table of function pointers), and status flags. The functions in the +vtable implement operations like "open", "read", and "write". 
+ +When I/O, for example "read", is requested, the request goes from Perl +first down the stack using "read" functions of each layer, then at the +bottom the input is requested from the operating system services, then +the result is returned up the stack, finally being interpreted as Perl +data. + +The requests do not necessarily go always all the way down to the +operating system: that's where PerlIO buffering comes into play. + +When you do an open() and specify extra PerlIO layers to be deployed, +the layers you specify are "pushed" on top of the already existing +default stack. One way to see it is that "operating system is +on the left" and "Perl is on the right". + +What exact layers are in this default stack depends on a lot of +things: your operating system, Perl version, Perl compile time +configuration, and Perl runtime configuration. See L, +L, and L for more information. + +binmode() operates similarly to open(): by default the specified +layers are pushed on top of the existing stack. + +However, note that even as the specified layers are "pushed on top" +for open() and binmode(), this doesn't mean that the effects are +limited to the "top": PerlIO layers can be very 'active' and inspect +and affect layers also deeper in the stack. As an example there +is a layer called "raw" which repeatedly "pops" layers until +it reaches the first layer that has declared itself capable of +handling binary data. The "pushed" layers are processed in left-to-right +order. + +sysopen() operates (unsurprisingly) at a lower level in the stack than +open(). For example in UNIX or UNIX-like systems sysopen() operates +directly at the level of file descriptors: in the terms of PerlIO +layers, it uses only the "unix" layer, which is a rather thin wrapper +on top of the UNIX file descriptors. 
+ =head2 Layers vs Disciplines Initial discussion of the ability to modify IO streams behaviour used @@ -34,9 +83,9 @@ believe) from the use of the term in "sfio", which in turn borrowed it from "line disciplines" on Unix terminals. However, this document (and the C code) uses the term "layer". -This is, I hope, a natural term given the implementation, and should avoid -connotations that are inherent in earlier uses of "discipline" for things -which are rather different. +This is, I hope, a natural term given the implementation, and should +avoid connotations that are inherent in earlier uses of "discipline" +for things which are rather different. =head2 Data Structures @@ -53,25 +102,30 @@ The basic data structure is a PerlIOl: IV flags; /* Various flags for state */ }; -A C is a pointer to to the struct, and the I level -C is a pointer to a C - i.e. a pointer to a pointer to -the struct. This allows the application level C to remain -constant while the actual C underneath changes. (Compare perl's -C which remains constant while its C field changes as the -scalar's type changes.) An IO stream is then in general represented as a -pointer to this linked-list of "layers". +A C is a pointer to the struct, and the I +level C is a pointer to a C - i.e. a pointer +to a pointer to the struct. This allows the application level C +to remain constant while the actual C underneath +changes. (Compare perl's C which remains constant while its +C field changes as the scalar's type changes.) An IO stream is +then in general represented as a pointer to this linked-list of +"layers". It should be noted that because of the double indirection in a C, -a C<< &(perlio-Enext) >> "is" a C, and so to some degree +a C<< &(perlio->next) >> "is" a C, and so to some degree at least one layer can use the "standard" API on the next layer down. A "layer" is composed of two parts: =over 4 -=item 1. The functions and attributes of the "layer class". +=item 1. 
+ +The functions and attributes of the "layer class". + +=item 2. -=item 2. The per-instance data for a particular handle. +The per-instance data for a particular handle. =back @@ -82,67 +136,83 @@ member of C. The functions (methods of the layer "class") are fixed, and are defined by the C type. They are broadly the same as the public C functions: - struct _PerlIO_funcs - { - char * name; - Size_t size; - IV kind; - IV (*Fileno)(PerlIO *f); - PerlIO * (*Fdopen)(PerlIO_funcs *tab, int fd, const char *mode); - PerlIO * (*Open)(PerlIO_funcs *tab, const char *path, const char *mode); - int (*Reopen)(const char *path, const char *mode, PerlIO *f); - IV (*Pushed)(PerlIO *f,const char *mode,const char *arg,STRLEN len); - IV (*Popped)(PerlIO *f); - /* Unix-like functions - cf sfio line disciplines */ - SSize_t (*Read)(PerlIO *f, void *vbuf, Size_t count); - SSize_t (*Unread)(PerlIO *f, const void *vbuf, Size_t count); - SSize_t (*Write)(PerlIO *f, const void *vbuf, Size_t count); - IV (*Seek)(PerlIO *f, Off_t offset, int whence); - Off_t (*Tell)(PerlIO *f); - IV (*Close)(PerlIO *f); - /* Stdio-like buffered IO functions */ - IV (*Flush)(PerlIO *f); - IV (*Fill)(PerlIO *f); - IV (*Eof)(PerlIO *f); - IV (*Error)(PerlIO *f); - void (*Clearerr)(PerlIO *f); - void (*Setlinebuf)(PerlIO *f); - /* Perl's snooping functions */ - STDCHAR * (*Get_base)(PerlIO *f); - Size_t (*Get_bufsiz)(PerlIO *f); - STDCHAR * (*Get_ptr)(PerlIO *f); - SSize_t (*Get_cnt)(PerlIO *f); - void (*Set_ptrcnt)(PerlIO *f,STDCHAR *ptr,SSize_t cnt); - }; - -The first few members of the struct give a "name" for the layer, the -size to C for the per-instance data, and some flags which are -attributes of the class as whole (such as whether it is a buffering + struct _PerlIO_funcs + { + Size_t fsize; + char * name; + Size_t size; + IV kind; + IV (*Pushed)(pTHX_ PerlIO *f,const char *mode,SV *arg, PerlIO_funcs *tab); + IV (*Popped)(pTHX_ PerlIO *f); + PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab, + AV *layers, IV n, + 
const char *mode, + int fd, int imode, int perm, + PerlIO *old, + int narg, SV **args); + IV (*Binmode)(pTHX_ PerlIO *f); + SV * (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags); + IV (*Fileno)(pTHX_ PerlIO *f); + PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o, CLONE_PARAMS *param, int flags); + /* Unix-like functions - cf sfio line disciplines */ + SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count); + SSize_t (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count); + SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count); + IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence); + Off_t (*Tell)(pTHX_ PerlIO *f); + IV (*Close)(pTHX_ PerlIO *f); + /* Stdio-like buffered IO functions */ + IV (*Flush)(pTHX_ PerlIO *f); + IV (*Fill)(pTHX_ PerlIO *f); + IV (*Eof)(pTHX_ PerlIO *f); + IV (*Error)(pTHX_ PerlIO *f); + void (*Clearerr)(pTHX_ PerlIO *f); + void (*Setlinebuf)(pTHX_ PerlIO *f); + /* Perl's snooping functions */ + STDCHAR * (*Get_base)(pTHX_ PerlIO *f); + Size_t (*Get_bufsiz)(pTHX_ PerlIO *f); + STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f); + SSize_t (*Get_cnt)(pTHX_ PerlIO *f); + void (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt); + }; + +The first few members of the struct give a function table size for a +compatibility check, a "name" for the layer, the size to C<malloc> for the per-instance data, +and some flags which are attributes of the class as a whole (such as whether it is a buffering layer), then follow the functions which fall into four basic groups: =over 4 -=item 1. Opening and setup functions +=item 1. + +Opening and setup functions -=item 2. Basic IO operations +=item 2. -=item 3. Stdio class buffering options. +Basic IO operations -=item 4. Functions to support Perl's traditional "fast" access to the buffer. +=item 3. + +Stdio class buffering options. + +=item 4. + +Functions to support Perl's traditional "fast" access to the buffer. =back -A layer does not have to implement all the functions, but the whole table has -to be present. 
Unimplemented slots can be NULL (which will will result in an error -when called) or can be filled in with stubs to "inherit" behaviour from -a "base class". This "inheritance" is fixed for all instances of the layer, -but as the layer chooses which stubs to populate the table, limited -"multiple inheritance" is possible. +A layer does not have to implement all the functions, but the whole +table has to be present. Unimplemented slots can be NULL (which will +result in an error when called) or can be filled in with stubs to +"inherit" behaviour from a "base class". This "inheritance" is fixed +for all instances of the layer, but as the layer chooses which stubs +to populate the table, limited "multiple inheritance" is possible. =head2 Per-instance Data -The per-instance data are held in memory beyond the basic PerlIOl struct, -by making a PerlIOl the first member of the layer's struct thus: +The per-instance data are held in memory beyond the basic PerlIOl +struct, by making a PerlIOl the first member of the layer's struct +thus: typedef struct { @@ -155,8 +225,8 @@ by making a PerlIOl the first member of the layer's struct thus: IV oneword; /* Emergency buffer */ } PerlIOBuf; -In this way (as for perl's scalars) a pointer to a PerlIOBuf can be treated -as a pointer to a PerlIOl. +In this way (as for perl's scalars) a pointer to a PerlIOBuf can be +treated as a pointer to a PerlIOl. =head2 Layers in action. @@ -191,32 +261,33 @@ dynamically) with a "socket" layer. =item * -Different handles can have different buffering schemes. The "top" layer -could be the "mmap" layer if reading disk files was quicker using C -than C. An "unbuffered" stream can be implemented simply by -not having a buffer layer. +Different handles can have different buffering schemes. The "top" +layer could be the "mmap" layer if reading disk files was quicker +using C than C. An "unbuffered" stream can be implemented +simply by not having a buffer layer. 
=item * Extra layers can be inserted to process the data as it flows through. This was the driving need for including the scheme in perl 5.7.0+ - we -needed a mechanism to allow data to be translated bewteen perl's +needed a mechanism to allow data to be translated between perl's internal encoding (conceptually at least Unicode as UTF-8), and the "native" format used by the system. This is provided by the ":encoding(xxxx)" layer which typically sits above the buffering layer. =item * -A layer can be added that does "\n" to CRLF translation. This layer can be used -on any platform, not just those that normally do such things. +A layer can be added that does "\n" to CRLF translation. This layer +can be used on any platform, not just those that normally do such +things. =back =head2 Per-instance flag bits -The generic flag bits are a hybrid of C style flags deduced from -the mode string passed to C, and state bits for typical buffer -layers. +The generic flag bits are a hybrid of C style flags deduced +from the mode string passed to C, and state bits for +typical buffer layers. =over 4 @@ -234,7 +305,7 @@ Reads are permitted i.e. opened "r" or "w+" (or even "a+" - ick). =item PERLIO_F_ERROR -An error has occured (for C) +An error has occurred (for C). =item PERLIO_F_TRUNCATE @@ -291,10 +362,10 @@ Handle is open. This instance of this layer supports the "fast C" interface. Normally set based on C for the class and by the -existance of the function(s) in the table. However a class that +existence of the function(s) in the table. However a class that normally provides that interface may need to avoid it on a particular instance. The "pending" layer needs to do this when -it is pushed above an layer which does not support the interface. +it is pushed above a layer which does not support the interface. (Perl's C does not expect the streams fast C behaviour to change during one "get".) @@ -304,126 +375,357 @@ to change during one "get".) 
=over 4 -=item IV (*Fileno)(PerlIO *f); +=item fsize + + Size_t fsize; + +Size of the function table. This is compared against the value PerlIO +code "knows" as a compatibility check. Future versions I be able +to tolerate layers compiled against an old version of the headers. + +=item name + + char * name; + +The name of the layer whose open() method Perl should invoke on +open(). For example if the layer is called APR, you will call: + + open $fh, ">:APR", ... + +and Perl knows that it has to invoke the PerlIOAPR_open() method +implemented by the APR layer. + +=item size + + Size_t size; + +The size of the per-instance data structure, e.g.: + + sizeof(PerlIOAPR) + +If this field is zero then C does not malloc anything +and assumes layer's Pushed function will do any required layer stack +manipulation - used to avoid malloc/free overhead for dummy layers. +If the field is non-zero it must be at least the size of C, +C will allocate memory for the layer's data structures +and link new layer onto the stream's stack. (If the layer's Pushed +method returns an error indication the layer is popped again.) + +=item kind + + IV kind; + +=over 4 + +=item * PERLIO_K_BUFFERED + +The layer is buffered. + +=item * PERLIO_K_RAW + +The layer is acceptable to have in a binmode(FH) stack - i.e. it does not +(or will configure itself not to) transform bytes passing through it. + +=item * PERLIO_K_CANCRLF + +Layer can translate between "\n" and CRLF line ends. + +=item * PERLIO_K_FASTGETS -Returns the Unix/Posix numeric file decriptor for the handle. Normally +Layer allows buffer snooping. + +=item * PERLIO_K_MULTIARG + +Used when the layer's open() accepts more arguments than usual. The +extra arguments should come not before the C argument. When this +flag is used it's up to the layer to validate the args. + +=back + +=item Pushed + + IV (*Pushed)(pTHX_ PerlIO *f,const char *mode, SV *arg); + +The only absolutely mandatory method. Called when the layer is pushed +onto the stack. 
The C argument may be NULL if this occurs +post-open. The C will be non-C if an argument string was +passed. In most cases this should call C to +convert C into the appropriate C flags in +addition to any actions the layer itself takes. If a layer is not +expecting an argument it need neither save the one passed to it, nor +provide C (it could perhaps C that the argument +was un-expected). + +Returns 0 on success. On failure returns -1 and should set errno. + +=item Popped + + IV (*Popped)(pTHX_ PerlIO *f); + +Called when the layer is popped from the stack. A layer will normally +be popped after C is called. But a layer can be popped +without being closed if the program is dynamically managing layers on +the stream. In such cases C should free any resources +(buffers, translation tables, ...) not held directly in the layer's +struct. It should also C any unconsumed data that has been +read and buffered from the layer below back to that layer, so that it +can be re-provided to what ever is now above. + +Returns 0 on success and failure. If C returns I then +I assumes that either the layer has popped itself, or the +layer is super special and needs to be retained for other reasons. +In most cases it should return I. + +=item Open + + PerlIO * (*Open)(...); + +The C method has lots of arguments because it combines the +functions of perl's C, C, perl's C, +C and C. The full prototype is as +follows: + + PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab, + AV *layers, IV n, + const char *mode, + int fd, int imode, int perm, + PerlIO *old, + int narg, SV **args); + +Open should (perhaps indirectly) call C to allocate +a slot in the table and associate it with the layers information for +the opened file, by calling C. The I AV is an +array of all the layers destined for the C, and any +arguments passed to them, I is the index into that array of the +layer being called. The macro C will return a (possibly +C) SV * for the argument passed to the layer. 
+ +The I<mode> string is an "C<fopen()>-like" string which would match +the regular expression C</^[I#]?[rwa]\+?[bt]?$/>. + +The C<'I'> prefix is used during creation of C<stdin>..C<stderr> via +special C<PerlIO_fdopen> calls; the C<'#'> prefix means that this is +C<sysopen> and that I<imode> and I<perm> should be passed to +C<PerlLIO_open3>; C<'r'> means B<r>ead, C<'w'> means B<w>rite and +C<'a'> means B<a>ppend. The C<'+'> suffix means that both reading and +writing/appending are permitted. The C<'b'> suffix means file should +be binary, and C<'t'> means it is text. (Almost all layers should do +the IO in binary mode, and ignore the b/t bits. The C<:crlf> layer +should be pushed to handle the distinction.) + +If I<old> is not C<NULL> then this is a C<PerlIO_reopen>. Perl itself +does not use this (yet?) and semantics are a little vague. + +If I<fd> is not negative then it is the numeric file descriptor I<fd>, +which will be open in a manner compatible with the supplied mode +string; the call is thus equivalent to C<PerlIO_fdopen>. In this case +I<narg> will be zero. + +If I<narg> is greater than zero then it gives the number of arguments +passed to C<open>, otherwise it will be 1 if for example +C<PerlIO_open> was called. In simple cases SvPV_nolen(*args) is the +pathname to open. + +Having said all that, translation-only layers do not need to provide +C<Open()> at all, but rather leave the opening to a lower level layer +and wait to be "pushed". If a layer does provide C<Open()> it should +normally call the C<Open()> method of next layer down (if any) and +then push itself on top if that succeeds. + +If C<PerlIO_push> was performed and open has failed, it must +C<PerlIO_pop> itself, since if it's not, the layer won't be removed +and may cause bad problems. + +Returns C<NULL> on failure. + +=item Binmode + + IV (*Binmode)(pTHX_ PerlIO *f); + +Optional. Used when C<:raw> layer is pushed (explicitly or as a result +of binmode(FH)). If not present layer will be popped. If present +should configure layer as binary (or pop itself) and return 0. +If it returns -1 for error C<binmode> will fail with layer +still on the stack. + +=item Getarg + + SV * (*Getarg)(pTHX_ PerlIO *f, + CLONE_PARAMS *param, int flags); + +Optional. 
If present should return an SV * representing the string +argument passed to the layer when it was +pushed. e.g. ":encoding(ascii)" would return an SvPV with value +"ascii". (I and I arguments can be ignored in most +cases) + +=item Fileno + + IV (*Fileno)(pTHX_ PerlIO *f); + +Returns the Unix/Posix numeric file descriptor for the handle. Normally C (which just asks next layer down) will suffice for this. -=item PerlIO * (*Fdopen)(PerlIO_funcs *tab, int fd, const char *mode); +Returns -1 on error, which is considered to include the case where the +layer cannot provide such a file descriptor. -Should (perhaps indirectly) call C to allocate a slot -in the table and associate it with the given numeric file descriptor, -which will be open in an manner compatible with the supplied mode string. +=item Dup -=item PerlIO * (*Open)(PerlIO_funcs *tab, const char *path, const char *mode); + PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o, + CLONE_PARAMS *param, int flags); -Should attempt to open the given path and if that succeeds then (perhaps -indirectly) call C to allocate a slot in the table and -associate it with the layers information for the opened file. +XXX: Needs more docs. -=item int (*Reopen)(const char *path, const char *mode, PerlIO *f); +Used as part of the "clone" process when a thread is spawned (in which +case param will be non-NULL) and when a stream is being duplicated via +'&' in the C. -Re-open the supplied C to connect it to C in C. -Returns as success flag. Perl does not use this and L marks it -as subject to change. +Similar to C, returns PerlIO* on success, C on failure. -=item IV (*Pushed)(PerlIO *f,const char *mode,const char *arg,STRLEN len); +=item Read -Called when the layer is pushed onto the stack. The C argument may -be NULL if this occurs post-open. The C and C will be present -if an argument string was passed. In most cases this should call -C to convert C into the appropriate -C flags in addition to any actions the layer itself takes. 
+ SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count); -=item IV (*Popped)(PerlIO *f); +Basic read operation. -Called when the layer is popped from the stack. A layer will normally be -popped after C<Close()> is called. But a layer can be popped without being -closed if the program is dynamically managing layers on the stream. In -such cases C<Popped()> should free any resources (buffers, translation -tables, ...) not held directly in the layer's struct. +Typically will call C<Fill> and manipulate pointers (possibly via the +API). C<PerlIOBuf_read()> may be suitable for derived classes which +provide "fast gets" methods. -=item SSize_t (*Read)(PerlIO *f, void *vbuf, Size_t count); +Returns actual bytes read, or -1 on an error. -Basic read operation. Returns actual bytes read, or -1 on an error. -Typically will call Fill and manipulate pointers (possibly via the API). -C<PerlIOBuf_read()> may be suitable for derived classes which provide -"fast gets" methods. +=item Unread -=item SSize_t (*Unread)(PerlIO *f, const void *vbuf, Size_t count); + SSize_t (*Unread)(pTHX_ PerlIO *f, + const void *vbuf, Size_t count); A superset of stdio's C<ungetc()>. Should arrange for future reads to see the bytes in C<vbuf>. If there is no obviously better implementation then C<PerlIOBase_unread()> provides the function by pushing a "fake" "pending" layer above the calling layer. -=item SSize_t (*Write)(PerlIO *f, const void *vbuf, Size_t count); +Returns the number of unread chars. + +=item Write + + SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count); + +Basic write operation. + +Returns bytes written or -1 on an error. -Basic write operation. Returns bytes written or -1 on an error. +=item Seek -=item IV (*Seek)(PerlIO *f, Off_t offset, int whence); + IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence); -Position the file pointer. Should normally call its own C<Flush> method and -then the C<Seek> method of next layer down. +Position the file pointer. Should normally call its own C<Flush> +method and then the C<Seek> method of next layer down. 
-=item Off_t (*Tell)(PerlIO *f); +Returns 0 on success, -1 on failure. + +=item Tell + + Off_t (*Tell)(pTHX_ PerlIO *f); Return the file pointer. May be based on layer's cached concept of position to avoid overhead. -=item IV (*Close)(PerlIO *f); +Returns -1 on failure to get the file pointer. + +=item Close + + IV (*Close)(pTHX_ PerlIO *f); Close the stream. Should normally call C<PerlIOBase_close()> to flush itself and close layers below, and then deallocate any data structures (buffers, translation tables, ...) not held directly in the data structure. -=item IV (*Flush)(PerlIO *f); +Returns 0 on success, -1 on failure. + +=item Flush + + IV (*Flush)(pTHX_ PerlIO *f); Should make stream's state consistent with layers below. That is, any buffered write data should be written, and file position of lower layers -adjusted for data read fron below but not actually consumed. +adjusted for data read from below but not actually consumed. +(Should perhaps C<Unread()> such data to the lower layer.) + +Returns 0 on success, -1 on failure. + +=item Fill -=item IV (*Fill)(PerlIO *f); + IV (*Fill)(pTHX_ PerlIO *f); -The buffer for this layer should be filled (for read) from layer below. +The buffer for this layer should be filled (for read) from layer +below. When you "subclass" PerlIOBuf layer, you want to use its +I<_read> method and to supply your own fill method, which fills the +PerlIOBuf's buffer. -=item IV (*Eof)(PerlIO *f); +Returns 0 on success, -1 on failure. + +=item Eof + + IV (*Eof)(pTHX_ PerlIO *f); Return end-of-file indicator. C<PerlIOBase_eof()> is normally sufficient. -=item IV (*Error)(PerlIO *f); +Returns 0 on end-of-file, 1 if not end-of-file, -1 on error. + +=item Error + + IV (*Error)(pTHX_ PerlIO *f); Return error indicator. C<PerlIOBase_error()> is normally sufficient. -=item void (*Clearerr)(PerlIO *f); +Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set), +0 otherwise. + +=item Clearerr + + void (*Clearerr)(pTHX_ PerlIO *f); Clear end-of-file and error indicators. Should call C<PerlIOBase_clearerr()> to set the C<PERLIO_F_XXXXX> flags, which may suffice. 
-=item void (*Setlinebuf)(PerlIO *f); +=item Setlinebuf + + void (*Setlinebuf)(pTHX_ PerlIO *f); -Mark the stream as line buffered. +Mark the stream as line buffered. C<PerlIOBase_setlinebuf()> sets the +PERLIO_F_LINEBUF flag and is normally sufficient. -=item STDCHAR * (*Get_base)(PerlIO *f); +=item Get_base + + STDCHAR * (*Get_base)(pTHX_ PerlIO *f); Allocate (if not already done) the read buffer for this layer and -return pointer to it. +return pointer to it. Return NULL on failure. + +=item Get_bufsiz -=item Size_t (*Get_bufsiz)(PerlIO *f); + Size_t (*Get_bufsiz)(pTHX_ PerlIO *f); Return the number of bytes that last C<Fill()> put in the buffer. -=item STDCHAR * (*Get_ptr)(PerlIO *f); +=item Get_ptr + + STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f); Return the current read pointer relative to this layer's buffer. -=item SSize_t (*Get_cnt)(PerlIO *f); +=item Get_cnt + + SSize_t (*Get_cnt)(pTHX_ PerlIO *f); Return the number of bytes left to be read in the current buffer. -=item void (*Set_ptrcnt)(PerlIO *f,STDCHAR *ptr,SSize_t cnt); +=item Set_ptrcnt + + void (*Set_ptrcnt)(pTHX_ PerlIO *f, + STDCHAR *ptr, SSize_t cnt); Adjust the read pointer and count of bytes to match C<ptr> and/or C<cnt>. The application (or layer above) must ensure they are consistent. @@ -431,6 +733,110 @@ The application (or layer above) must ensure they are consistent. =back +=head2 Utilities + +To ask for the next layer down use PerlIONext(PerlIO *f). + +To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All +this really does is check that the pointer is non-NULL and +that the pointer behind that is non-NULL.) + +PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words, +the C<PerlIOl*> pointer. + +PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type. 
+ +Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either +calls the I from the functions of the layer I (just by +the name of the IO function, like "Read") with the I, or if +there is no such callback, calls the I version of the callback +with the same args, or if the f is invalid, set errno to EBADF and +return I. + +Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls +the I of the functions of the layer I with the I, +or if there is no such callback, set errno to EINVAL. Or if the f is +invalid, set errno to EBADF and return I. + +Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls +the I of the functions of the layer I with the I, +or if there is no such callback, calls the I version of the +callback with the same args, or if the f is invalid, set errno to +EBADF. + +Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the +I of the functions of the layer I with the I, or if +there is no such callback, set errno to EINVAL. Or if the f is +invalid, set errno to EBADF. + +=head2 Implementing PerlIO Layers + +If you find the implementation document unclear or not sufficient, +look at the existing PerlIO layer implementations, which include: + +=over + +=item * C implementations + +The F and F in the Perl core implement the +"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending" +layers, and also the "mmap" and "win32" layers if applicable. +(The "win32" is currently unfinished and unused, to see what is used +instead in Win32, see L .) + +PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core. + +PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN. + +=item * Perl implementations + +PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN. + +=back + +If you are creating a PerlIO layer, you may want to be lazy, in other +words, implement only the methods that interest you. 
The other methods +you can either replace with the "blank" methods + + PerlIOBase_noop_ok + PerlIOBase_noop_fail + +(which do nothing, and return zero and -1, respectively) or for +certain methods you may assume a default behaviour by using a NULL +method. The Open method looks for help in the 'parent' layer. +The following table summarizes the behaviour: + + method behaviour with NULL + + Clearerr PerlIOBase_clearerr + Close PerlIOBase_close + Dup PerlIOBase_dup + Eof PerlIOBase_eof + Error PerlIOBase_error + Fileno PerlIOBase_fileno + Fill FAILURE + Flush SUCCESS + Getarg SUCCESS + Get_base FAILURE + Get_bufsiz FAILURE + Get_cnt FAILURE + Get_ptr FAILURE + Open INHERITED + Popped SUCCESS + Pushed SUCCESS + Read PerlIOBase_read + Seek FAILURE + Set_cnt FAILURE + Set_ptrcnt FAILURE + Setlinebuf PerlIOBase_setlinebuf + Tell FAILURE + Unread PerlIOBase_unread + Write FAILURE + + FAILURE Set errno (to EINVAL in UNIXish, to LIB$_INVARG in VMS) and + return -1 (for numeric return values) or NULL (for pointers) + INHERITED Inherited from the layer below + SUCCESS Return 0 (for numeric return values) or a pointer =head2 Core Layers @@ -448,13 +854,13 @@ between O_TEXT and O_BINARY this layer is always O_BINARY. A very complete generic buffering layer which provides the whole of PerlIO API. It is also intended to be used as a "base class" for other -layers. (For example its C method is implemented in terms of the -C/C/C methods). +layers. (For example its C method is implemented in terms of +the C/C/C methods). "perlio" over "unix" provides a complete replacement for stdio as seen via PerlIO API. This is the default for USE_PERLIO when system's stdio -does not permit perl's "fast gets" access, and which do not distinguish -between C and C. +does not permit perl's "fast gets" access, and which do not +distinguish between C and C. =item "stdio" @@ -485,21 +891,24 @@ minimalist "derived" layer. 
=item "pending" An "internal" derivative of "perlio" which can be used to provide -Unread() function for layers which have no buffer or cannot be bothered. -(Basically this layer's C pops itself off the stack and so resumes -reading from layer below.) +Unread() function for layers which have no buffer or cannot be +bothered. (Basically this layer's C pops itself off the stack +and so resumes reading from layer below.) =item "raw" A dummy layer which never exists on the layer stack. Instead when -"pushed" it actually pops the stack(!), removing itself, and any other -layers until it reaches a layer with the class C bit set. +"pushed" it actually pops the stack removing itself, it then calls +Binmode function table entry on all the layers in the stack - normally +this (via PerlIOBase_binmode) removes any layers which do not have +C bit set. Layers can modify that behaviour by defining +their own Binmode entry. =item "utf8" Another dummy layer. When pushed it pops itself and sets the -C flag on the layer which was (and now is once more) the top -of the stack. +C flag on the layer which was (and now is once more) +the top of the stack. =back @@ -509,23 +918,118 @@ which do not need to do anything special for a particular method. =head2 Extension Layers -Layers can made available by extension modules. +Layers can made available by extension modules. When an unknown layer +is encountered the PerlIO code will perform the equivalent of : + + use PerlIO 'layer'; + +Where I is the unknown layer. F will then attempt to: + + require PerlIO::layer; + +If after that process the layer is still not defined then the C +will fail. + +The following extension layers are bundled with perl: =over 4 -=item "encoding" +=item ":encoding" use Encoding; -makes this layer available. It is an example of a layer which takes an argument. -as it is called as: +makes this layer available, although F "knows" where to +find it. 
It is an example of a layer which takes an argument as it is +called thus: + + open( $fh, "<:encoding(iso-8859-7)", $pathname ); + +=item ":scalar" + +Provides support for reading data from and writing data to a scalar. + + open( $fh, "+<:scalar", \$scalar ); + +When a handle is so opened, then reads get bytes from the string value +of I<$scalar>, and writes change the value. In both cases the position +in I<$scalar> starts as zero but can be altered via C<seek>, and +determined via C<tell>. + +Please note that this layer is implied when calling open() thus: + + open( $fh, "+<", \$scalar ); - open($fh,"<:encoding(iso-8859-7)",$pathname) +=item ":via" + +Provided to allow layers to be implemented as Perl code. For instance: + + use PerlIO::via::StripHTML; + open( my $fh, "<:via(StripHTML)", "index.html" ); + +See L<PerlIO::via> for details. + =back +=head1 TODO -=cut +Things that need to be done to improve this document. + +=over + +=item * + +Explain how to make a valid fh without going through open() (i.e. apply +a layer). For example if the file is not opened through perl, but we +want to get back a fh, like it was opened by Perl. +How PerlIO_apply_layera fits in, where its docs are, was it made public? +Currently the example could be something like this: + + PerlIO *foo_to_PerlIO(pTHX_ char *mode, ...) + { + /* mode is "w", "r", etc */ + const char *layers = ":APR"; /* the layer name */ + PerlIO *f = PerlIO_allocate(aTHX); + if (!f) { + return NULL; + } + + PerlIO_apply_layers(aTHX_ f, mode, layers); + + if (f) { + PerlIOAPR *st = PerlIOSelf(f, PerlIOAPR); + /* fill in the st struct, as in _open() */ + st->file = file; + PerlIOBase(f)->flags |= PERLIO_F_OPEN; + + return f; + } + return NULL; + } + +=item * + +fix/add the documentation in places marked as XXX. + +=item * + +The handling of errors by the layer is not specified. e.g. when $! +should be set explicitly, when the error handling should be just +delegated to the top layer. 
+ +Probably give some hints on using SETERRNO() or pointers to where they +can be found. + +=item * + +I think it would help to give some concrete examples to make it easier +to understand the API. Of course I agree that the API has to be +concise, but since there is no second document that is more of a +guide, I think that it'd make it easier to start with the doc which is +an API, but has examples in it in places where things are unclear, to +a person who is not a PerlIO guru (yet). + +=back + +=cut