This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
package PerlIO;

our $VERSION = '1.07';

# Map layer name to package that defines it
our %alias;

sub import
{
 my $class = shift;
 while (@_)
  {
   my $layer = shift;
   if (exists $alias{$layer})
    {
     $layer = $alias{$layer}
    }
   else
    {
     $layer = "${class}::$layer";
    }
   eval "require $layer";
   warn $@ if $@;
  }
}

sub F_UTF8 () { 0x8000 }

1;
__END__

=head1 NAME

PerlIO - On demand loader for PerlIO layers and root of PerlIO::* name space

=head1 SYNOPSIS

  open($fh,"<:crlf", "my.txt"); # support platform-native and CRLF text files

  open($fh,"<","his.jpg"); # portably open a binary file for reading
  binmode($fh);

  Shell:
    PERLIO=perlio perl ....

=head1 DESCRIPTION

When an undefined layer 'foo' is encountered in an C<open> or
C<binmode> layer specification then C code performs the equivalent of:

  use PerlIO 'foo';

The perl code in PerlIO.pm then attempts to locate a layer by doing

  require PerlIO::foo;

Otherwise the C<PerlIO> package is a place holder for additional
PerlIO related functions.

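As a rough illustration of the on-demand loading described above, the
following sketch opens an in-memory file with an C<:encoding> layer and
then looks in C<%INC> to see whether C<PerlIO::encoding> was pulled in.
The variable names are made up for the example, and it assumes that
C<PerlIO::encoding> is installed (it normally ships with perl) and has
not already been loaded by something else.

```perl
use strict;
use warnings;

# Assumption: nothing has loaded PerlIO::encoding yet in this process.
my $loaded_before = exists $INC{'PerlIO/encoding.pm'} ? 1 : 0;

# Asking for a layer that is not yet defined makes perl hand the name to
# PerlIO.pm, which tries "require PerlIO::encoding".
my $text = "hello\n";
open(my $fh, '<:encoding(UTF-8)', \$text) or die "open failed: $!";
my $line = <$fh>;
close($fh);

my $loaded_after = exists $INC{'PerlIO/encoding.pm'} ? 1 : 0;
print "PerlIO::encoding loaded before open: $loaded_before, after: $loaded_after\n";
```

If the layer module cannot be found, the C<warn $@> in C<import> reports
the failed C<require> and the C<open> itself fails.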

The following layers are currently defined:

=over 4

=item :unix

Lowest level layer which provides basic PerlIO operations in terms of
UNIX/POSIX numeric file descriptor calls
(open(), read(), write(), lseek(), close()).

=item :stdio

Layer which calls C<fread>, C<fwrite> and C<fseek>/C<ftell> etc. Note
that as this is "real" stdio it will ignore any layers beneath it and
go straight to the operating system via the C library as usual.

=item :perlio

A from-scratch implementation of buffering for PerlIO. Provides fast
access to the buffer for C<sv_gets> which implements perl's readline/E<lt>E<gt>
and in general attempts to minimize data copying.

C<:perlio> will insert a C<:unix> layer below itself to do low level IO.

=item :crlf

A layer that implements DOS/Windows-like CRLF line endings. On read
converts pairs of CR,LF to a single "\n" newline character. On write
converts each "\n" to a CR,LF pair. Note that this layer will silently
refuse to be pushed on top of itself.

It currently does I<not> mimic MS-DOS in treating Control-Z
as an end-of-file marker.

Based on the C<:perlio> layer.

=item :mmap

A layer which implements "reading" of files by using C<mmap()> to
make a (whole) file appear in the process's address space, and then
using that as PerlIO's "buffer". This I<may> be faster in certain
circumstances for large files, and may result in less physical memory
use when multiple processes are reading the same file.

Files which are not C<mmap()>-able revert to behaving like the C<:perlio>
layer. Writes also behave like the C<:perlio> layer, as C<mmap()> for write
needs extra house-keeping (to extend the file) which negates any advantage.

The C<:mmap> layer will not exist if the platform does not support C<mmap()>.

=item :utf8

Declares that the stream accepts perl's I<internal> encoding of
characters. (Which really is UTF-8 on ASCII machines, but is
UTF-EBCDIC on EBCDIC machines.) This allows any character perl can
represent to be read from or written to the stream. The UTF-X encoding
is chosen to render simple text parts (i.e. non-accented letters,
digits and common punctuation) human readable in the encoded file.

Here is how to write your native data out using UTF-8 (or UTF-EBCDIC)
and then read it back in.

  open(F, ">:utf8", "data.utf");
  print F $out;
  close(F);

  open(F, "<:utf8", "data.utf");
  $in = <F>;
  close(F);

Note that this layer does not validate byte sequences. For reading
input, using C<:encoding(utf8)> instead of bare C<:utf8> is strongly
recommended.

=item :bytes

This is the inverse of the C<:utf8> layer. It turns off the flag
on the layer below so that data read from it is considered to
be "octets" i.e. characters in the range 0..255 only. Likewise
on output perl will warn if a "wide" character is written
to such a stream.

=item :raw

The C<:raw> layer is I<defined> as being identical to calling
C<binmode($fh)> - the stream is made suitable for passing binary data,
i.e. each byte is passed as-is. The stream will still be
buffered.

In Perl 5.6 and some books the C<:raw> layer (previously sometimes also
referred to as a "discipline") is documented as the inverse of the
C<:crlf> layer. That is no longer the case - other layers which would
alter the binary nature of the stream are also disabled. If you want UNIX
line endings on a platform that normally does CRLF translation, but still
want UTF-8 or encoding defaults, the appropriate thing to do is to add
C<:perlio> to the PERLIO environment variable.

The implementation of C<:raw> is as a pseudo-layer which when "pushed"
pops itself and then any layers which do not declare themselves as suitable
for binary data. (Undoing :utf8 and :crlf are implemented by clearing
flags rather than popping layers but that is an implementation detail.)

As a consequence of the fact that C<:raw> normally pops layers,
it usually only makes sense to have it as the only or first element in
a layer specification. When used as the first element it provides
a known base on which to build e.g.

  open($fh,"<:raw:utf8",...)

will construct a "binary" stream, but then enable UTF-8 translation.

=item :pop

A pseudo layer that removes the top-most layer. Gives perl code
a way to manipulate the layer stack. Should be considered
as experimental. Note that C<:pop> only works on real layers
and will not undo the effects of pseudo layers like C<:utf8>.
An example of a possible use might be:

  open($fh,...)
  ...
  binmode($fh,":encoding(...)"); # next chunk is encoded
  ...
  binmode($fh,":pop"); # back to un-encoded

A more elegant (and safer) interface is needed.

=item :win32

On Win32 platforms this I<experimental> layer uses the native "handle" IO
rather than the unix-like numeric file descriptor layer. Known to be
buggy as of perl 5.8.2.

=back

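The C<:crlf> and C<:raw> descriptions above can be checked with a small
round trip. This is only a sketch: the scratch file comes from
C<File::Temp> and the variable names are invented for the example. It
writes two lines through C<:crlf>, reads the stored bytes back through
C<:raw>, and then reads the file once more through C<:crlf>.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file; File::Temp removes it at exit.
my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp);

# Write two lines through :crlf; each "\n" becomes a CR,LF pair on disk.
open(my $out, '>:crlf', $path) or die "open for write failed: $!";
print {$out} "one\ntwo\n";
close($out) or die "close failed: $!";

# Read the bytes back untranslated with :raw.
open(my $raw, '<:raw', $path) or die "open for raw read failed: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

# Read again through :crlf; CR,LF pairs collapse back to "\n".
open(my $in, '<:crlf', $path) or die "open for crlf read failed: $!";
my $text = do { local $/; <$in> };
close($in);

printf "raw length %d, translated length %d\n", length($bytes), length($text);
```

The raw read sees two extra bytes, one CR for each line, which the
C<:crlf> read removes again.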

=head2 Custom Layers

It is possible to write custom layers in addition to the above builtin
ones, both in C/XS and Perl. Two such layers (and one example written
in Perl using the latter) come with the Perl distribution.

=over 4

=item :encoding

Use C<:encoding(ENCODING)> either in open() or binmode() to install
a layer that transparently does character set and encoding transformations,
for example from Shift-JIS to Unicode. Note that under C<stdio>
an C<:encoding> also enables C<:utf8>. See L<PerlIO::encoding>
for more information.

=item :via

Use C<:via(MODULE)> either in open() or binmode() to install a layer
that does whatever transformation (for example compression /
decompression, encryption / decryption) to the filehandle.
See L<PerlIO::via> for more information.

=back

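To make the C<:encoding> description a little more concrete, here is a
minimal sketch that writes one non-Latin-1 character through
C<:encoding(UTF-8)>, inspects the stored bytes with C<:raw>, and decodes
them again on input. The scratch file and variable names are
assumptions made for the example only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);   # hypothetical scratch file
close($tmp);

# One character outside Latin-1: U+263A WHITE SMILING FACE.
my $char = "\x{263A}";

# The :encoding layer turns characters into UTF-8 bytes on output.
open(my $out, '>:encoding(UTF-8)', $path) or die "open for write failed: $!";
print {$out} $char;
close($out) or die "close failed: $!";

# Inspect what actually landed on disk.
open(my $raw, '<:raw', $path) or die "open for raw read failed: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

# Decode again on input.
open(my $in, '<:encoding(UTF-8)', $path) or die "open for read failed: $!";
my $back = do { local $/; <$in> };
close($in);

printf "%d character(s) stored as %d byte(s): %s\n",
    length($char), length($bytes),
    join(' ', map { sprintf '%02X', ord } split //, $bytes);
```

The same pattern should work with any encoding name that L<Encode>
recognises, although the byte counts will differ.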

=head2 Alternatives to raw

To get a binary stream an alternate method is to use:

  open($fh,"whatever");
  binmode($fh);

This has the advantage of being backward compatible with how such things have
had to be coded on some platforms for years.

To get an unbuffered stream specify an unbuffered layer (e.g. C<:unix>)
in the open call:

  open($fh,"<:unix",$path);

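One way to compare these alternatives is to look at the layers each
call leaves on the handle. The sketch below is illustrative only: it
uses a scratch file from C<File::Temp>, the handle names are invented,
and the exact layer names printed depend on the platform and perl
version.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);   # hypothetical scratch file
print {$tmp} "data\n";
close($tmp) or die "close failed: $!";

# Binary stream, spelled two ways.
open(my $via_raw, '<:raw', $path) or die "open failed: $!";
open(my $via_binmode, '<', $path) or die "open failed: $!";
binmode($via_binmode);

# Unbuffered stream: only the low level layer is requested.
open(my $unbuffered, '<:unix', $path) or die "open failed: $!";

my $raw_layers        = join ' ', PerlIO::get_layers($via_raw);
my $binmode_layers    = join ' ', PerlIO::get_layers($via_binmode);
my $unbuffered_layers = join ' ', PerlIO::get_layers($unbuffered);

print ":raw on open    : $raw_layers\n";
print "open + binmode  : $binmode_layers\n";
print ":unix on open   : $unbuffered_layers\n";
```

On a recent perl the first two lines are expected to match, while the
third shows no buffering layer above C<unix>.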

=head2 Defaults and how to override them

If the platform is MS-DOS like and normally does CRLF to "\n"
translation for text files then the default layers are:

  unix crlf

(The low level "unix" layer may be replaced by a platform specific low
level layer.)

Otherwise if C<Configure> found out how to do "fast" IO using the system's
stdio, then the default layers are:

  unix stdio

Otherwise the default layers are:

  unix perlio

These defaults may change once perlio has been better tested and tuned.

The default can be overridden by setting the environment variable
PERLIO to a space separated list of layers (C<unix> or platform low
level layer is always pushed first).

This can be used to see the effect of/bugs in the various layers e.g.

  cd .../perl/t
  PERLIO=stdio ./perl harness
  PERLIO=perlio ./perl harness

For the various values of PERLIO see L<perlrun/PERLIO>.

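The effect of the PERLIO variable can also be observed from Perl
itself. The following sketch starts a child perl under several PERLIO
settings and prints the layers the child reports for its C<STDOUT>. The
probe one-liner and the variable names are assumptions for this
example, and the list form of a piped C<open> may not be available on
every platform or perl version.

```perl
use strict;
use warnings;

# One-liner run in a child perl: print the layers on its STDOUT.
my $probe = 'print join(q( ), PerlIO::get_layers(*STDOUT)), qq(\n)';

my %layers_for;
for my $setting ('', 'perlio', 'stdio') {
    # An empty PERLIO is documented as equivalent to leaving it unset.
    local $ENV{PERLIO} = $setting;

    # List-form pipe open avoids shell quoting; $^X is the running perl.
    open(my $child, '-|', $^X, '-e', $probe) or die "cannot run $^X: $!";
    chomp(my $line = <$child> // '');
    close($child);

    $layers_for{$setting} = $line;
    printf "PERLIO=%-8s -> %s\n", "'$setting'", $line;
}
```

Because C<local> restores C<$ENV{PERLIO}> at the end of each iteration,
the parent process keeps whatever setting it started with.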

=head2 Querying the layers of filehandles

The following returns the B<names> of the PerlIO layers on a filehandle.

  my @layers = PerlIO::get_layers($fh); # Or FH, *FH, "FH".

The layers are returned in the order an open() or binmode() call would
use them. Note that the "default stack" depends on the operating
system and on the Perl version, and both the compile-time and
runtime configurations of Perl.

The following table summarizes the default layers on UNIX-like and
DOS-like platforms and depending on the setting of C<$ENV{PERLIO}>:

  PERLIO     UNIX-like                DOS-like
  ------     ---------                --------
  unset / "" unix perlio / stdio [1]  unix crlf
  stdio      unix perlio / stdio [1]  stdio
  perlio     unix perlio              unix perlio
  mmap       unix mmap                unix mmap

  # [1] "stdio" if Configure found out how to do "fast stdio" (depends
  # on the stdio implementation) and in Perl 5.8, otherwise "unix perlio"

By default the layers from the input side of the filehandle are
returned; to get the output side, use the optional C<output> argument:

  my @layers = PerlIO::get_layers($fh, output => 1);

(Usually the layers are identical on either side of a filehandle but
for example with sockets there may be differences, or if you have
been using the C<open> pragma.)

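A short sketch of both calls, assuming a read/write scratch file from
C<File::Temp> (the variable names are illustrative). It records the
layers before and after pushing an C<:encoding> layer with binmode()
and queries both the input and the output side.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file, opened for reading and writing.
my ($fh, $path) = tempfile(UNLINK => 1);

my @before = PerlIO::get_layers($fh);

# Push an encoding layer, then look again on both sides of the handle.
binmode($fh, ':encoding(UTF-8)') or die "binmode failed: $!";
my @input  = PerlIO::get_layers($fh);
my @output = PerlIO::get_layers($fh, output => 1);

print "before: @before\n";
print "input:  @input\n";
print "output: @output\n";
close($fh);
```

The names printed depend on the platform and the PERLIO setting, so the
output is best treated as diagnostic rather than something to match
against literally.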

There is no set_layers(), nor does get_layers() return a tied array
mirroring the stack, or anything fancy like that. This is not
accidental or unintentional. The PerlIO layer stack is a bit more
complicated than just a stack (see for example the behaviour of C<:raw>).
You are supposed to use open() and binmode() to manipulate the stack.

B<Implementation details follow, please close your eyes.>

The arguments to layers are by default returned in parentheses after
the name of the layer, and certain layers (like C<utf8>) are not real
layers but instead flags on real layers; to get all of these returned
separately, use the optional C<details> argument:

  my @layer_and_args_and_flags = PerlIO::get_layers($fh, details => 1);

The result will be up to three times the number of layers:
the first element will be a name, the second element the arguments
(unspecified arguments will be C<undef>), the third element the flags,
the fourth element a name again, and so forth.

B<You may open your eyes now.>

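For those who kept their eyes open, the flat list can be regrouped into
one record per layer. This is a sketch under stated assumptions: the
record layout and variable names are invented here, and the flag test
relies on the C<F_UTF8> constant defined in PerlIO.pm matching the
flag value used internally.

```perl
use strict;
use warnings;
use PerlIO;                       # loads PerlIO.pm so PerlIO::F_UTF8 exists
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);   # hypothetical scratch file
binmode($fh, ':utf8');

my @flat = PerlIO::get_layers($fh, details => 1);

# Regroup the flat list into one record per layer: name, arguments, flags.
my @records;
while (my ($name, $args, $flags) = splice(@flat, 0, 3)) {
    push @records, {
        name => $name,
        args => $args,
        utf8 => (defined $flags && ($flags & PerlIO::F_UTF8())) ? 1 : 0,
    };
}

for my $r (@records) {
    printf "%-8s args=%-10s utf8=%d\n",
        $r->{name} // '(undef)', $r->{args} // '(none)', $r->{utf8};
}
close($fh);
```

Here C<:utf8> does not show up as a layer of its own; it appears only
as a flag on a real layer.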

=head1 AUTHOR

Nick Ing-Simmons E<lt>nick@ing-simmons.netE<gt>

=head1 SEE ALSO

L<perlfunc/"binmode">, L<perlfunc/"open">, L<perlunicode>, L<perliol>,
L<Encode>

=cut