=head1 DESCRIPTION
This document describes the behavior and implementation of the PerlIO
-abstraction described in L<perlapio> when C<USE_PERLIO> is defined (and
-C<USE_SFIO> is not).
+abstraction described in L<perlapio> when C<USE_PERLIO> is defined.
=head2 History and Background
The aim of the implementation is to provide the PerlIO API in a flexible
and platform neutral manner. It is also a trial of an "Object Oriented
-C, with vtables" approach which may be applied to perl6.
+C, with vtables" approach which may be applied to Perl 6.
+
+=head2 Basic Structure
+
+PerlIO is a stack of layers.
+
+The low levels of the stack work with the low-level operating system
+calls (file descriptors in C) getting bytes in and out, the higher
+layers of the stack buffer, filter, and otherwise manipulate the I/O,
+and return characters (or bytes) to Perl. The terms I<above> and I<below>
+are used to refer to the relative positioning of the stack layers.
+
+A layer contains a "vtable", the table of I/O operations (at C level
+a table of function pointers), and status flags. The functions in the
+vtable implement operations like "open", "read", and "write".
+
+When I/O, for example "read", is requested, the request goes from Perl
+first down the stack using "read" functions of each layer, then at the
+bottom the input is requested from the operating system services, then
+the result is returned up the stack, finally being interpreted as Perl
+data.
+
+The requests do not necessarily always go all the way down to the
+operating system: that's where PerlIO buffering comes into play.
+
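The flow described above can be sketched in plain C. This is only an illustrative miniature, not the real PerlIO types: C<struct layer>, C<struct layer_funcs>, C<bottom_read>, C<upcase_read> and C<stack_read> are hypothetical names, and the bottom layer reads from a string instead of calling the operating system.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a layer stack; not the real PerlIO structs. */
struct layer;

struct layer_funcs {
    const char *name;
    /* Read up to count bytes into buf; return the number of bytes read. */
    size_t (*read)(struct layer *self, char *buf, size_t count);
};

struct layer {
    struct layer *next;             /* layer below, NULL at the bottom */
    const struct layer_funcs *tab;  /* vtable for this layer */
    const char *src;                /* bottom layer only: fake "OS" data */
    size_t pos;                     /* bottom layer only: read offset */
};

/* Bottom layer: stands in for the operating system read call. */
static size_t bottom_read(struct layer *self, char *buf, size_t count)
{
    size_t left = strlen(self->src) - self->pos;
    size_t n = count < left ? count : left;
    memcpy(buf, self->src + self->pos, n);
    self->pos += n;
    return n;
}

/* Upper layer: asks the layer below, then filters the result on the
 * way back up (here: upper-cases ASCII letters). */
static size_t upcase_read(struct layer *self, char *buf, size_t count)
{
    size_t n = self->next->tab->read(self->next, buf, count);
    for (size_t i = 0; i < n; i++)
        if (buf[i] >= 'a' && buf[i] <= 'z')
            buf[i] = (char)(buf[i] - 'a' + 'A');
    return n;
}

static const struct layer_funcs bottom_funcs = { "bottom", bottom_read };
static const struct layer_funcs upcase_funcs = { "upcase", upcase_read };

/* A request always enters at the top of the stack. */
static size_t stack_read(struct layer *top, char *buf, size_t count)
{
    return top->tab->read(top, buf, count);
}
```

Calling C<stack_read> on a stack whose top is an "upcase" layer over a "bottom" layer holding "hello" goes down to C<bottom_read> first and hands "HELLO" back to the caller. A buffering layer would differ only in that it could answer from its own buffer without calling the layer below.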
+When you do an open() and specify extra PerlIO layers to be deployed,
+the layers you specify are "pushed" on top of the already existing
+default stack. One way to see it is that "the operating system is
+on the left" and "Perl is on the right".
+
+What exact layers are in this default stack depends on a lot of
+things: your operating system, Perl version, Perl compile time
+configuration, and Perl runtime configuration. See L<PerlIO>,
+L<perlrun/PERLIO>, and L<open> for more information.
+
+binmode() operates similarly to open(): by default the specified
+layers are pushed on top of the existing stack.
+
+However, note that even as the specified layers are "pushed on top"
+for open() and binmode(), this doesn't mean that the effects are
+limited to the "top": PerlIO layers can be very 'active' and inspect
+and affect layers deeper in the stack as well. As an example, there
+is a layer called "raw" which repeatedly "pops" layers until
+it reaches the first layer that has declared itself capable of
+handling binary data. The "pushed" layers are processed in left-to-right
+order.
+
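The popping behaviour described for "raw" can be pictured with a small loop. This is a hedged sketch, not the code in F<perlio.c>: C<struct mini_layer>, the C<LAYER_BINARY_OK> flag and C<pop_to_binary> are made-up names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical flag: the layer declares it can pass binary data through. */
#define LAYER_BINARY_OK 0x1u

struct mini_layer {
    struct mini_layer *next;  /* layer below */
    const char *name;
    unsigned flags;
};

/* Pop layers off *top until the top layer accepts binary data, or the
 * stack is empty. Returns the number of layers popped. */
static int pop_to_binary(struct mini_layer **top)
{
    int popped = 0;
    while (*top != NULL && !((*top)->flags & LAYER_BINARY_OK)) {
        *top = (*top)->next;  /* unlink; a real layer would also be freed */
        popped++;
    }
    return popped;
}
```

With a stack of two text-only layers above one binary-capable layer, the function pops two layers and leaves the binary-capable one on top; calling it again pops nothing.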
+sysopen() operates (unsurprisingly) at a lower level in the stack than
+open(). For example in Unix or Unix-like systems sysopen() operates
+directly at the level of file descriptors: in the terms of PerlIO
+layers, it uses only the "unix" layer, which is a rather thin wrapper
+on top of the Unix file descriptors.
=head2 Layers vs Disciplines
{
PerlIOl * next; /* Lower layer */
PerlIO_funcs * tab; /* Functions for this layer */
- IV flags; /* Various flags for state */
+ U32 flags; /* Various flags for state */
};
A C<PerlIOl *> is a pointer to the struct, and the I<application>
fixed, and are defined by the C<PerlIO_funcs> type. They are broadly the
same as the public C<PerlIO_xxxxx> functions:
- struct _PerlIO_funcs
- {
- Size_t fsize;
- char * name;
- Size_t size;
- IV kind;
- IV (*Pushed)(pTHX_ PerlIO *f,const char *mode,SV *arg, PerlIO_funcs *tab);
- IV (*Popped)(pTHX_ PerlIO *f);
- PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab,
- AV *layers, IV n,
- const char *mode,
- int fd, int imode, int perm,
- PerlIO *old,
- int narg, SV **args);
- IV (*Binmode)(pTHX_ PerlIO *f);
- SV * (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags)
- IV (*Fileno)(pTHX_ PerlIO *f);
- PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o, CLONE_PARAMS *param, int flags)
- /* Unix-like functions - cf sfio line disciplines */
- SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
- SSize_t (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
- SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
- IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
- Off_t (*Tell)(pTHX_ PerlIO *f);
- IV (*Close)(pTHX_ PerlIO *f);
- /* Stdio-like buffered IO functions */
- IV (*Flush)(pTHX_ PerlIO *f);
- IV (*Fill)(pTHX_ PerlIO *f);
- IV (*Eof)(pTHX_ PerlIO *f);
- IV (*Error)(pTHX_ PerlIO *f);
- void (*Clearerr)(pTHX_ PerlIO *f);
- void (*Setlinebuf)(pTHX_ PerlIO *f);
- /* Perl's snooping functions */
- STDCHAR * (*Get_base)(pTHX_ PerlIO *f);
- Size_t (*Get_bufsiz)(pTHX_ PerlIO *f);
- STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f);
- SSize_t (*Get_cnt)(pTHX_ PerlIO *f);
- void (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt);
- };
+ struct _PerlIO_funcs
+ {
+ Size_t fsize;
+ char * name;
+ Size_t size;
+ IV kind;
+ IV (*Pushed)(pTHX_ PerlIO *f,
+ const char *mode,
+ SV *arg,
+ PerlIO_funcs *tab);
+ IV (*Popped)(pTHX_ PerlIO *f);
+ PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab,
+ PerlIO_list_t *layers, IV n,
+ const char *mode,
+ int fd, int imode, int perm,
+ PerlIO *old,
+ int narg, SV **args);
+ IV (*Binmode)(pTHX_ PerlIO *f);
+ SV * (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags);
+ IV (*Fileno)(pTHX_ PerlIO *f);
+ PerlIO * (*Dup)(pTHX_ PerlIO *f,
+ PerlIO *o,
+ CLONE_PARAMS *param,
+ int flags);
+ /* Unix-like functions - cf sfio line disciplines */
+ SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
+ SSize_t (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
+ SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
+ IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
+ Off_t (*Tell)(pTHX_ PerlIO *f);
+ IV (*Close)(pTHX_ PerlIO *f);
+ /* Stdio-like buffered IO functions */
+ IV (*Flush)(pTHX_ PerlIO *f);
+ IV (*Fill)(pTHX_ PerlIO *f);
+ IV (*Eof)(pTHX_ PerlIO *f);
+ IV (*Error)(pTHX_ PerlIO *f);
+ void (*Clearerr)(pTHX_ PerlIO *f);
+ void (*Setlinebuf)(pTHX_ PerlIO *f);
+ /* Perl's snooping functions */
+ STDCHAR * (*Get_base)(pTHX_ PerlIO *f);
+ Size_t (*Get_bufsiz)(pTHX_ PerlIO *f);
+ STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f);
+ SSize_t (*Get_cnt)(pTHX_ PerlIO *f);
+ void (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt);
+ };
The first few members of the struct give a function table size for
a compatibility check, a "name" for the layer, the size to C<malloc> for the
per-instance data,
in the table correspond to C<stdin>,C<stdout> and C<stderr>. The table
in turn points to the current "top" layer for the handle - in this case
an instance of the generic buffering layer "perlio". That layer in turn
-points to the next layer down - in this case the lowlevel "unix" layer.
+points to the next layer down - in this case the low-level "unix" layer.
The above is roughly equivalent to a "stdio" buffered stream, but with
much more flexibility:
=item Pushed
- IV (*Pushed)(pTHX_ PerlIO *f,const char *mode, SV *arg);
+ IV (*Pushed)(pTHX_ PerlIO *f,const char *mode, SV *arg);
The only absolutely mandatory method. Called when the layer is pushed
onto the stack. The C<mode> argument may be NULL if this occurs
follows:
PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab,
- AV *layers, IV n,
+ PerlIO_list_t *layers, IV n,
const char *mode,
int fd, int imode, int perm,
PerlIO *old,
Open should (perhaps indirectly) call C<PerlIO_allocate()> to allocate
a slot in the table and associate it with the layers information for
-the opened file, by calling C<PerlIO_push>. The I<layers> AV is an
+the opened file, by calling C<PerlIO_push>. The I<layers> argument is an
array of all the layers destined for the C<PerlIO *>, and any
arguments passed to them, I<n> is the index into that array of the
layer being called. The macro C<PerlIOArg> will return a (possibly
C<NULL>) SV * for the argument passed to the layer.
+Where a layer opens or takes ownership of a file descriptor, that layer is
+responsible for getting the file descriptor's close-on-exec flag into the
+correct state. The flag should be clear for a file descriptor numbered
+less than or equal to C<PL_maxsysfd>, and set for any file descriptor
+numbered higher. For thread safety, when a layer opens a new file
+descriptor it should if possible open it with the close-on-exec flag
+initially set.
+
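The close-on-exec rule above can be sketched with the POSIX C<fcntl()> interface. This is a minimal illustration under stated assumptions, not the code PerlIO uses: C<fix_cloexec> is a hypothetical helper, and its C<max_sys_fd> parameter stands in for C<PL_maxsysfd>.

```c
#include <fcntl.h>

/* Hypothetical helper: bring fd's close-on-exec flag into the state
 * the rule asks for. max_sys_fd stands in for PL_maxsysfd.
 * Returns 0 on success, -1 on error (errno set by fcntl). */
static int fix_cloexec(int fd, int max_sys_fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    if (fd <= max_sys_fd)
        flags &= ~FD_CLOEXEC;   /* system fds are inherited across exec */
    else
        flags |= FD_CLOEXEC;    /* everything else is closed on exec */
    return fcntl(fd, F_SETFD, flags) == -1 ? -1 : 0;
}
```

Setting the flag after the fact leaves a window in which another thread could fork and exec. That is why the text recommends opening with the flag already set where possible, for example by passing C<O_CLOEXEC> to C<open()> on platforms that provide it.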
The I<mode> string is an "C<fopen()>-like" string which would match
the regular expression C</^[I#]?[rwa]\+?[bt]?$/>.
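To make the shape of the mode string concrete, here is a hand-written matcher for the regular expression above. C<mode_is_valid> is a hypothetical helper written for this illustration; it is not part of the PerlIO API.

```c
#include <stddef.h>

/* Hypothetical helper: return 1 if mode matches
 * /^[I#]?[rwa]\+?[bt]?$/, else 0. The '$' is treated as a strict
 * end of string, so a trailing newline is rejected here. */
static int mode_is_valid(const char *mode)
{
    if (mode == NULL)
        return 0;
    if (*mode == 'I' || *mode == '#')   /* optional prefix */
        mode++;
    if (*mode != 'r' && *mode != 'w' && *mode != 'a')
        return 0;                       /* exactly one of r, w, a */
    mode++;
    if (*mode == '+')                   /* optional update flag */
        mode++;
    if (*mode == 'b' || *mode == 't')   /* optional binary/text flag */
        mode++;
    return *mode == '\0';
}
```

Under this reading "r", "Iw", "a+t" and "#r+b" are accepted, while "rw", "+r", "rb+" and the empty string are rejected.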
which will be open in a manner compatible with the supplied mode
string, the call is thus equivalent to C<PerlIO_fdopen>. In this case
I<nargs> will be zero.
+The file descriptor may have the close-on-exec flag either set or clear;
+it is the responsibility of the layer that takes ownership of it to get
+the flag into the correct state.
If I<nargs> is greater than zero then it gives the number of arguments
passed to C<open>, otherwise it will be 1 if for example
C<PerlIO_open> was called. In simple cases C<SvPV_nolen(*args)> is the
pathname to open.
-Having said all that translation-only layers do not need to provide
-C<Open()> at all, but rather leave the opening to a lower level layer
-and wait to be "pushed". If a layer does provide C<Open()> it should
-normally call the C<Open()> method of next layer down (if any) and
-then push itself on top if that succeeds.
+If a layer provides C<Open()> it should normally call the C<Open()>
+method of the next layer down (if any) and then push itself on top if that
+succeeds. C<PerlIOBase_open> is provided to do exactly that, so in
+most cases you don't have to write your own C<Open()> method. If this
+method is not defined, other layers may have difficulty pushing
+themselves on top of it during open.
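The "open the layer below, then push yourself" pattern can be sketched as a small recursion. This is a toy model and not the implementation of C<PerlIOBase_open>: C<struct handle>, C<struct open_layer>, C<MAX_DEPTH> and C<layered_open> are hypothetical, and a layer is represented only by its name.

```c
#include <stddef.h>

#define MAX_DEPTH 8

struct handle {                    /* hypothetical stand-in for a handle */
    const char *stack[MAX_DEPTH];  /* layer names, bottom first */
    int depth;
};

struct open_layer {                /* hypothetical layer description */
    const char *name;
    int can_open;                  /* nonzero: can open a file by itself */
};

/* Open through layers[0..n]. Layer n first asks layer n-1 to open,
 * and pushes itself only if that succeeded.
 * Returns 0 on success, -1 on failure (nothing is left pushed by the
 * failing layer). */
static int layered_open(struct handle *h,
                        const struct open_layer *layers, int n)
{
    if (n == 0) {
        if (!layers[0].can_open)
            return -1;             /* bottom layer must do the real open */
        h->depth = 0;
    } else if (layered_open(h, layers, n - 1) != 0) {
        return -1;                 /* lower layer failed: do not push */
    }
    if (h->depth >= MAX_DEPTH)
        return -1;
    h->stack[h->depth++] = layers[n].name;  /* push self on top */
    return 0;
}
```

Opening with the layer list "unix", "perlio", "crlf" ends with "unix" at the bottom of the handle's stack and "crlf" on top, which mirrors the order in which real layers would be pushed.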
If C<PerlIO_push> was performed and the open has failed, the layer must
C<PerlIO_pop> itself, since otherwise the layer won't be removed
"ascii". (I<param> and I<flags> arguments can be ignored in most
cases)
+C<Dup> uses C<Getarg> to retrieve the argument originally passed to
+C<Pushed>, so you must implement this function if your layer has an
+extra argument to C<Pushed> and will ever be C<Dup>ed.
+
=item Fileno
IV (*Fileno)(pTHX_ PerlIO *f);
Return error indicator. C<PerlIOBase_error()> is normally sufficient.
-Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set,
+Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set),
0 otherwise.
=item Clearerr
=back
+=head2 Utilities
+
+To ask for the next layer down use PerlIONext(PerlIO *f).
+
+To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All
+this really does is check that the pointer is non-NULL and
+that the pointer behind that is non-NULL.)
+
+PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words,
+the C<PerlIOl*> pointer.
+
+PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type.
+
+Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either
+calls the I<callback> from the functions of the layer I<f> (just by
+the name of the IO function, like "Read") with the I<args>, or if
+there is no such callback, calls the I<base> version of the callback
+with the same args, or if I<f> is invalid, sets errno to EBADF and
+returns I<failure>.
+
+Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls
+the I<callback> of the functions of the layer I<f> with the I<args>,
+or if there is no such callback, sets errno to EINVAL. Or if I<f> is
+invalid, it sets errno to EBADF and returns I<failure>.
+
+Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls
+the I<callback> of the functions of the layer I<f> with the I<args>,
+or if there is no such callback, calls the I<base> version of the
+callback with the same args, or if I<f> is invalid, sets errno to
+EBADF.
+
+Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the
+I<callback> of the functions of the layer I<f> with the I<args>, or if
+there is no such callback, sets errno to EINVAL. Or if I<f> is
+invalid, it sets errno to EBADF.
+
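The dispatch rules of these utilities can be modelled in a few lines of standalone C. This is a sketch under assumptions, not the macros from F<perliol.h>: all names below (C<io_funcs>, C<io_layer>, C<io_handle>, C<IO_VALID>, C<read_or_base>, C<read_or_fail>, C<own_read>, C<fallback_read>) are hypothetical, only a "Read" slot is modelled, and C<read_or_fail> is assumed to return the failure value in the EINVAL case as well.

```c
#include <errno.h>
#include <stddef.h>

struct io_funcs {                    /* hypothetical one-slot vtable */
    long (*Read)(void *self, char *buf, long count);
};

struct io_layer { const struct io_funcs *tab; };

/* A handle is a pointer to a layer pointer, so validity means that
 * both the pointer and the pointer behind it are non-NULL. */
typedef struct io_layer *io_handle;
#define IO_VALID(f) ((f) != NULL && *(f) != NULL)

/* Example callbacks used only for illustration. */
static long own_read(void *self, char *buf, long count)
{
    (void)self; (void)buf;
    return count;                    /* pretend everything was read */
}

static long fallback_read(void *self, char *buf, long count)
{
    (void)self; (void)buf; (void)count;
    return 0;                        /* base version: nothing read */
}

/* Layer's own Read if present, else the base version;
 * invalid handle: errno = EBADF, return failure. */
static long read_or_base(io_handle *f, char *buf, long count,
                         long (*base)(void *, char *, long), long failure)
{
    if (!IO_VALID(f)) {
        errno = EBADF;
        return failure;
    }
    if ((*f)->tab != NULL && (*f)->tab->Read != NULL)
        return (*f)->tab->Read(*f, buf, count);
    return base(*f, buf, count);
}

/* Layer's own Read if present, else errno = EINVAL;
 * invalid handle: errno = EBADF. Both error paths return failure
 * (an assumption of this sketch). */
static long read_or_fail(io_handle *f, char *buf, long count, long failure)
{
    if (!IO_VALID(f)) {
        errno = EBADF;
        return failure;
    }
    if ((*f)->tab != NULL && (*f)->tab->Read != NULL)
        return (*f)->tab->Read(*f, buf, count);
    errno = EINVAL;
    return failure;
}
```

The "_void" variants follow the same decision tree but have no return value, so an error is visible only through errno.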
=head2 Implementing PerlIO Layers
+If you find the implementation document unclear or not sufficient,
+look at the existing PerlIO layer implementations, which include:
+
+=over
+
+=item * C implementations
+
+The F<perlio.c> and F<perliol.h> in the Perl core implement the
+"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending"
+layers, and also the "mmap" and "win32" layers if applicable.
+(The "win32" layer is currently unfinished and unused; to see what is used
+instead in Win32, see L<PerlIO/"Querying the layers of filehandles">.)
+
+PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.
+
+PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.
+
+=item * Perl implementations
+
+PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN.
+
+=back
+
If you are creating a PerlIO layer, you may want to be lazy, in other
words, implement only the methods that interest you. The other methods
you can either replace with the "blank" methods
Unread PerlIOBase_unread
Write FAILURE
- FAILURE Set errno (to EINVAL in UNIXish, to LIB$_INVARG in VMS) and
- return -1 (for numeric return values) or NULL (for pointers)
+ FAILURE Set errno (to EINVAL in Unixish, to LIB$_INVARG in VMS)
+ and return -1 (for numeric return values) or NULL (for
+ pointers)
INHERITED Inherited from the layer below
SUCCESS Return 0 (for numeric return values) or a pointer
=head2 Extension Layers
-Layers can made available by extension modules. When an unknown layer
+Layers can be made available by extension modules. When an unknown layer
is encountered the PerlIO code will perform the equivalent of:
use PerlIO 'layer';