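# Map layer name to package that defines it
our %alias;

sub import {
    my $class = shift;
    while (@_) {
        my $layer = shift;
        # Use the registered alias if there is one, else look under PerlIO::*
        $layer = exists $alias{$layer} ? $alias{$layer} : "${class}::$layer";
        eval { require $layer =~ s{::}{/}gr . '.pm' };
        warn $@ if $@;
    }
}

sub F_UTF8 () { 0x8000 }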

=head1 NAME

PerlIO - On demand loader for PerlIO layers and root of PerlIO::* name space

=head1 SYNOPSIS

  # support platform-native and CRLF text files
  open(my $fh, "<:crlf", "my.txt") or die "open failed: $!";

  # append UTF-8 encoded text
  open(my $fh, ">>:encoding(UTF-8)", "some.log")
    or die "open failed: $!";

  # portably open a binary file for reading
  open(my $fh, "<", "his.jpg") or die "open failed: $!";
  binmode($fh) or die "binmode failed: $!";

  # from the shell: choose the default layers via the environment
  PERLIO=:perlio perl ....

=head1 DESCRIPTION

When an undefined layer 'foo' is encountered in an C<open> or
C<binmode> layer specification, C code performs the equivalent of
C<use PerlIO 'foo';>.

The Perl code in PerlIO.pm then attempts to locate the layer by doing
C<require PerlIO::foo;>.

Otherwise the C<PerlIO> package is a placeholder for additional
PerlIO-related functions.
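
The on-demand loading described above can be observed from Perl code. The following is a minimal sketch, assuming that C<PerlIO::encoding> is installed (it ships with Perl) and using a hypothetical in-memory input in C<$bytes>. It checks C<%INC> after naming the C<:encoding> layer in C<open>.

```perl
use strict;
use warnings;

# Hypothetical in-memory input; a file name would work the same way.
my $bytes = "caf\xC3\xA9\n";

# Naming :encoding(...) makes perl locate and load PerlIO::encoding.
open(my $fh, "<:encoding(UTF-8)", \$bytes) or die "open failed: $!";

# A successful require records the module in %INC.
my $loaded = exists $INC{'PerlIO/encoding.pm'} ? 1 : 0;
print "PerlIO::encoding loaded: $loaded\n";
close($fh);
```

No explicit C<use PerlIO::encoding> appears in the sketch; the layer specification alone is expected to trigger the load.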

Generally speaking, PerlIO layers (previously sometimes referred to as
"disciplines") are an ordered stack applied to a filehandle (specified as
a space- or colon-separated list, conventionally written with a leading
colon). Each layer performs some operation on any input or output, except
when bypassed such as with C<sysread> or C<syswrite>. Read operations go
through the stack in the order they are set (left to right), and write
operations in the reverse order.
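
As an illustration of the stack order, the sketch below opens a hypothetical in-memory input with two layers and lists the resulting stack with C<PerlIO::get_layers>. The exact names reported may vary between Perl versions and platforms, so no output is shown.

```perl
use strict;
use warnings;

my $data = "one\ntwo\n";    # hypothetical in-memory "file"
open(my $fh, "<:crlf:encoding(UTF-8)", \$data) or die "open failed: $!";

# Layers are reported in the order open() applied them (left to right).
my @layers = PerlIO::get_layers($fh);
print join(" ", @layers), "\n";
close($fh);
```

The C<crlf> entry should appear before the C<encoding(...)> entry, matching the left-to-right order of the layer specification.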

There are also layers which actually just set flags on lower layers, or
layers that modify the current stack but don't persist on the stack
themselves; these are referred to as pseudo-layers.

When opening a handle, it will be opened with any layers specified
explicitly in the open() call (or the platform defaults, if specified as
a colon with no following layers).

If layers are not explicitly specified, the handle will be opened with the
layers specified by the L<${^OPEN}|perlvar/"${^OPEN}"> variable (usually
set by using the L<open> pragma for a lexical scope, or the C<-C>
command-line switch or C<PERL_UNICODE> environment variable for the main
program).

If layers are not specified in the open() call or C<${^OPEN}> variable,
the handle will be opened with the default layer stack configured for that
architecture; see L</"Defaults and how to override them">.

Some layers will automatically insert required lower level layers if not
present; for example C<:perlio> will insert C<:unix> below itself for low
level IO, and C<:encoding> will insert the platform defaults for buffered
IO.

The C<binmode> function can be called on an opened handle to push
additional layers onto the stack, which may also modify the existing
layers. C<binmode> called with no layers will remove or unset any
existing layers which transform the byte stream, making the handle
suitable for binary data.
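
The effect of C<binmode> on the stack can be sketched as follows. The scratch file from C<File::Temp> is an assumption made for illustration; any open handle would do.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);    # scratch file for the demo

# Push an additional layer onto the open handle.
binmode($fh, ":encoding(UTF-8)") or die "binmode failed: $!";
my @with = PerlIO::get_layers($fh);

# No layer argument: drop whatever transforms the byte stream.
binmode($fh) or die "binmode failed: $!";
my @without = PerlIO::get_layers($fh);

print "with:    @with\n";
print "without: @without\n";
```

The second listing should no longer contain an C<encoding(...)> entry.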

The following layers are currently defined:

C<:unix> is the lowest level layer, which provides basic PerlIO operations
in terms of UNIX/POSIX numeric file descriptor calls
(open(), read(), write(), lseek(), close()).
It is used even on non-Unix architectures, and most other layers operate on
top of it.

C<:stdio> is a layer which calls C<fread>, C<fwrite> and C<fseek>/C<ftell>
etc. Note that as this is "real" stdio it will ignore any layers beneath it
and go straight to the operating system via the C library as usual.
This layer implements both low level IO and buffering, but is rarely used
on modern architectures.

C<:perlio> is a from-scratch implementation of buffering for PerlIO. It
provides fast access to the buffer for C<sv_gets>, which implements Perl's
readline/E<lt>E<gt>, and in general attempts to minimize data copying.

C<:perlio> will insert a C<:unix> layer below itself to do low level IO.

C<:crlf> is a layer that implements DOS/Windows-like CRLF line endings. On
read it converts pairs of CR,LF to a single "\n" newline character. On write
it converts each "\n" to a CR,LF pair. Note that this layer will silently
refuse to be pushed on top of itself.

It currently does I<not> mimic MS-DOS in treating Control-Z as an
end-of-file marker.

On DOS/Windows-like architectures where this layer is part of the defaults,
it also acts like the C<:perlio> layer, and removing the CRLF translation
(such as with C<:raw>) will only unset the CRLF translation flag. Since
Perl 5.14, you can also apply another C<:crlf> layer later, such as when
the CRLF translation must occur after an encoding layer. On other
architectures, it is a mundane CRLF translation layer and can be added and
removed normally.

    # translate CRLF after encoding on Perl 5.14 or newer
    binmode $fh, ":raw:encoding(UTF-16LE):crlf"
      or die "binmode failed: $!";
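
The read-side translation of C<:crlf> can be tried on any platform. The following is a minimal sketch, assuming a scratch file created with C<File::Temp>. It writes DOS-style line endings as raw bytes and reads them back through the layer.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($out, $path) = tempfile(UNLINK => 1);
binmode($out) or die "binmode failed: $!";
print {$out} "first\r\nsecond\r\n";    # CR,LF pairs stored on disk
close($out) or die "close failed: $!";

# Reading through :crlf should turn each CR,LF pair into a single "\n".
open(my $in, "<:crlf", $path) or die "open failed: $!";
my @lines = <$in>;
close($in);

my $has_cr = grep { /\r/ } @lines;
print scalar(@lines), " lines read, ",
      ($has_cr ? "some" : "none"), " still contain a CR\n";
```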

C<:utf8> is a pseudo-layer that declares that the stream accepts Perl's
I<internal> upgraded encoding of characters, which is approximately UTF-8
on ASCII machines, but UTF-EBCDIC on EBCDIC machines. This allows any
character Perl can represent to be read from or written to the stream.

This layer (which actually sets a flag on the preceding layer, and is
implicitly set by any C<:encoding> layer) does not translate or validate
byte sequences. It instead indicates that the byte stream will have been
arranged by other layers to be provided in Perl's internal upgraded
encoding, which Perl code (and correctly written XS code) will interpret
as decoded Unicode characters.

B<CAUTION>: Do not use this layer to translate from UTF-8 bytes, as
invalid UTF-8 or binary data will result in malformed Perl strings. It is
unlikely to produce invalid UTF-8 when used for output, though it will
instead produce UTF-EBCDIC on EBCDIC systems. The C<:encoding(UTF-8)>
layer (hyphen is significant) is preferred as it will ensure translation
between valid UTF-8 bytes and valid Unicode characters.
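
A short sketch of the preferred C<:encoding(UTF-8)> layer follows. The input is a hypothetical in-memory byte string holding the three UTF-8 bytes of U+20AC (the euro sign); on an ASCII platform they should decode to a single character.

```perl
use strict;
use warnings;

my $bytes = "\xE2\x82\xAC";    # UTF-8 encoding of U+20AC

# Decode (and validate) the bytes while reading.
open(my $fh, "<:encoding(UTF-8)", \$bytes) or die "open failed: $!";
my $text = <$fh>;
close($fh);

printf "%d byte(s) in, %d character(s) out, U+%04X\n",
    length($bytes), length($text), ord($text);
```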

C<:bytes> is the inverse of the C<:utf8> pseudo-layer. It turns off the
flag on the layer below so that data read from it is considered to
be Perl's internal downgraded encoding, thus interpreted as the native
single-byte encoding of Latin-1 or EBCDIC. Likewise on output Perl will
warn if a "wide" character (a codepoint not in the range 0..255) is
written to such a stream.

This is very dangerous to push on a handle using an C<:encoding> layer,
as such a layer expects to be working with Perl's internal upgraded
encoding, so you will likely get a mangled result. Instead use C<:raw> or
C<:pop> to remove encoding layers.

The C<:raw> pseudo-layer is I<defined> as being identical to calling
C<binmode($fh)> - the stream is made suitable for passing binary data,
i.e. each byte is passed as-is. The stream will still be buffered
(but this was not always true before Perl 5.14).

In Perl 5.6 and some books the C<:raw> layer is documented as the inverse
of the C<:crlf> layer. That is no longer the case - other layers which
would alter the binary nature of the stream are also disabled. If you
want UNIX line endings on a platform that normally does CRLF translation,
but still want UTF-8 or encoding defaults, the appropriate thing to do is
to add C<:perlio> to the PERLIO environment variable, or open the handle
explicitly with that layer, to replace the platform default of C<:crlf>.

The implementation of C<:raw> is as a pseudo-layer which when "pushed"
pops itself and then any layers which would modify the binary data stream.
(Undoing C<:utf8> and C<:crlf> may be implemented by clearing flags
rather than popping layers but that is an implementation detail.)

As a consequence of the fact that C<:raw> normally pops layers,
it usually only makes sense to have it as the only or first element in
a layer specification. When used as the first element it provides
a known base on which to build e.g.

    open(my $fh,">:raw:encoding(UTF-8)",...)
      or die "open failed: $!";

will construct a "binary" stream regardless of the platform defaults,
but then enable UTF-8 translation.

C<:pop> is a pseudo-layer that removes the top-most layer. It gives Perl
code a way to manipulate the layer stack. Note that C<:pop> only works on
real layers and will not undo the effects of pseudo-layers or flags
like C<:utf8>. An example of a possible use might be:

    open(my $fh,...) or die "open failed: $!";
    binmode($fh,":encoding(...)") or die "binmode failed: $!";
    # next chunk is encoded
    binmode($fh,":pop") or die "binmode failed: $!";

A more elegant (and safer) interface is needed.

C<:win32> is an I<experimental> layer that, on Win32 platforms, uses the
native "handle" IO rather than the unix-like numeric file descriptor layer.
Known to be buggy as of Perl 5.8.2.

It is possible to write custom layers in addition to the above builtin
ones, both in C/XS and Perl, as a module named C<< PerlIO::<layer name> >>.
Some custom layers come with the Perl distribution.

Use C<:encoding(ENCODING)> to transparently do character set and encoding
transformations, for example from Shift-JIS to Unicode. Note that an
C<:encoding> also enables C<:utf8>. See L<PerlIO::encoding> for more
information.

C<:mmap> is a layer which implements "reading" of files by using C<mmap()>
to make a (whole) file appear in the process's address space, and then
using that as PerlIO's "buffer". This I<may> be faster in certain
circumstances for large files, and may result in less physical memory
use when multiple processes are reading the same file.

Files which are not C<mmap()>-able revert to behaving like the C<:perlio>
layer. Writes also behave like the C<:perlio> layer, as C<mmap()> for write
needs extra house-keeping (to extend the file) which negates any advantage.

The C<:mmap> layer will not exist if the platform does not support C<mmap()>.
See L<PerlIO::mmap> for more information.

C<:via(MODULE)> allows a transformation to be applied by an arbitrary Perl
module, for example compression / decompression, encryption / decryption.
See L<PerlIO::via> for more information.

C<:scalar> is a layer implementing "in memory" files using scalar variables,
automatically used in place of the platform defaults for IO when opening
such a handle. As such, the scalar is expected to act like a file, only
containing or storing bytes. See L<PerlIO::scalar> for more information.
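
An "in memory" file can be sketched as below. The variable C<$buffer> is a name chosen for illustration; passing a reference to it in place of a file name is what selects the scalar layer.

```perl
use strict;
use warnings;

my $buffer = '';    # the scalar that plays the role of the file

# Write into the scalar as if it were a file.
open(my $out, ">", \$buffer) or die "open failed: $!";
print {$out} "line 1\n", "line 2\n";
close($out) or die "close failed: $!";

# Read the same scalar back through a second handle.
open(my $in, "<", \$buffer) or die "open failed: $!";
my @layers = PerlIO::get_layers($in);
my @lines  = <$in>;
close($in);

print "layers: @layers\n";
print "read back ", scalar(@lines), " lines\n";
```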

=head2 Alternatives to raw

To get a binary stream an alternate method is to use:

    open(my $fh,"<","whatever") or die "open failed: $!";
    binmode($fh) or die "binmode failed: $!";

This has the advantage of being backward compatible with older versions
of Perl that did not use PerlIO or where C<:raw> was buggy (as it was
before Perl 5.14).

To get an unbuffered stream specify an unbuffered layer (e.g. C<:unix>)
in the open call:

    open(my $fh,"<:unix",$path) or die "open failed: $!";
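
To check that such a handle really has no buffering layer, its stack can be listed. This is a sketch under the assumption of a Unix-like platform and a scratch file from C<File::Temp>; the expectation is that only the C<unix> layer is reported.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);    # hypothetical scratch file
close($tmp);

# Ask for the low level layer only, with no buffering layer on top.
open(my $fh, "<:unix", $path) or die "open failed: $!";
my @layers = PerlIO::get_layers($fh);
close($fh);

print "layers: @layers\n";
```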

=head2 Defaults and how to override them

If the platform is MS-DOS like and normally does CRLF to "\n"
translation for text files then the default layers are C<:unix:crlf>.

Otherwise if C<Configure> found out how to do "fast" IO using the system's
stdio (not common on modern architectures), then the default layers are
C<:stdio>.

Otherwise the default layers are C<:unix:perlio>.

Note that the "default stack" depends on the operating system and on the
Perl version, and both the compile-time and runtime configurations of
Perl. The default can be overridden by setting the environment variable
PERLIO to a space- or colon-separated list of layers; however, this cannot
be used to set layers that require loading modules, like C<:encoding>.

This can be used to see the effect of/bugs in the various layers e.g.

    PERLIO=:stdio  ./perl harness
    PERLIO=:perlio ./perl harness

For the various values of PERLIO see L<perlrun/PERLIO>.

The following table summarizes the default layers on UNIX-like and
DOS-like platforms, depending on the setting of C<$ENV{PERLIO}>:

    PERLIO      UNIX-like                  DOS-like
    ------      ---------                  --------
    unset / ""  :unix:perlio / :stdio [1]  :unix:crlf
    :perlio     :unix:perlio               :unix:perlio

    # [1] ":stdio" if Configure found out how to do "fast stdio" (depends
    # on the stdio implementation) and in Perl 5.8, else ":unix:perlio"
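
The defaults in effect for a particular perl can be inspected directly. The sketch below assumes a scratch file from C<File::Temp> and opens it with no explicit layers; the result depends on the platform, the build, and C<$ENV{PERLIO}>, so no output is shown.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);    # hypothetical scratch file
close($tmp);

# No layers in the mode string, so the configured defaults apply.
open(my $fh, "<", $path) or die "open failed: $!";
my @defaults = PerlIO::get_layers($fh);
close($fh);

my $setting = defined $ENV{PERLIO} ? "\"$ENV{PERLIO}\"" : "unset";
print "PERLIO is $setting; default layers: @defaults\n";
```

Running the same script with different C<PERLIO> settings is one way to compare the rows of the table.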

=head2 Querying the layers of filehandles

The following returns the B<names> of the PerlIO layers on a filehandle.

    my @layers = PerlIO::get_layers($fh); # Or FH, *FH, "FH".

The layers are returned in the order an open() or binmode() call would
use them, and without colons.

By default the layers from the input side of the filehandle are
returned; to get the output side, use the optional C<output> argument:

    my @layers = PerlIO::get_layers($fh, output => 1);

(Usually the layers are identical on either side of a filehandle but
for example with sockets there may be differences.)
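
For an ordinary file both sides are expected to report the same stack. This is a minimal sketch, assuming a read/write scratch file from C<File::Temp>:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);    # opened for reading and writing
binmode($fh, ":encoding(UTF-8)") or die "binmode failed: $!";

my @in  = PerlIO::get_layers($fh);                 # input side (the default)
my @out = PerlIO::get_layers($fh, output => 1);    # output side

print "input:  @in\n";
print "output: @out\n";
```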

There is no set_layers(), nor does get_layers() return a tied array
mirroring the stack, or anything fancy like that. This is not
accidental or unintentional. The PerlIO layer stack is a bit more
complicated than just a stack (see for example the behaviour of C<:raw>).
You are supposed to use open() and binmode() to manipulate the stack.

B<Implementation details follow, please close your eyes.>

The arguments to layers are by default returned in parentheses after
the name of the layer, and certain layers (like C<:utf8>) are not real
layers but instead flags on real layers; to get all of these returned
separately, use the optional C<details> argument:

    my @layer_and_args_and_flags = PerlIO::get_layers($fh, details => 1);

The result will have up to three times as many elements as there are
layers: the first element will be a name, the second element the arguments
(unspecified arguments will be C<undef>), the third element the flags,
the fourth element a name again, and so forth.
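
The flat list can be walked three elements at a time. The following sketch uses a hypothetical in-memory input and prints each triple; the flag values are implementation details and are shown only in hexadecimal.

```perl
use strict;
use warnings;

my $data = "x\n";    # hypothetical in-memory input
open(my $fh, "<:encoding(UTF-8)", \$data) or die "open failed: $!";
my @details = PerlIO::get_layers($fh, details => 1);
close($fh);

my $count = @details;    # remember the size before the list is consumed

# Each layer contributes a (name, arguments, flags) triple.
while (my ($name, $args, $flags) = splice(@details, 0, 3)) {
    printf "%-10s args=%-16s flags=0x%08x\n",
        $name, (defined $args ? $args : "undef"), $flags;
}
```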

B<You may open your eyes now.>

=head1 AUTHOR

Nick Ing-Simmons E<lt>nick@ing-simmons.netE<gt>

=head1 SEE ALSO

L<perlfunc/"binmode">, L<perlfunc/"open">, L<perlunicode>, L<perliol>,