=head1 NAME

perlnumber - semantics of numbers and numeric operations in Perl

=head1 SYNOPSIS

    $n = 1234;		    # decimal integer
    $n = 0b1110011;	    # binary integer
    $n = 01234;		    # octal integer
    $n = 0x1234;	    # hexadecimal integer
    $n = 12.34e-56;	    # exponential notation
    $n = "-12.34e56";	    # number specified as a string
    $n = "1234";	    # number specified as a string

=head1 DESCRIPTION

This document describes how Perl internally handles numeric values.

Perl's operator overloading facility is completely ignored here.  Operator
overloading allows user-defined behaviors for numbers, such as operations
over arbitrarily large integers, floating point numbers with arbitrary
precision, operations over "exotic" numbers such as modular arithmetic or
p-adic arithmetic, and so on.  See L<overload> for details.

=head1 Storing numbers

Perl can internally represent numbers in 3 different ways: as native
integers, as native floating point numbers, and as decimal strings.
Decimal strings may have an exponential notation part, as in C<"12.34e-56">.
I<Native> here means "a format supported by the C compiler which was used
to build perl".

The term "native" does not mean quite as much when we talk about native
integers, as it does when native floating point numbers are involved.
The only implication of the term "native" on integers is that the limits for
the maximal and the minimal supported true integral quantities are close to
powers of 2.  However, "native" floats have a most fundamental
restriction: they may represent only those numbers which have a relatively
"short" representation when converted to a binary fraction.  For example,
0.9 cannot be represented by a native float, since the binary fraction
for 0.9 is infinite:

  0.9 (decimal) = 0.1110011001100... (binary)

with the sequence C<1100> repeating again and again.  In addition to this
limitation, the exponent of the binary number is also restricted when it
is represented as a floating point number.  On typical hardware, floating
point values can store numbers with up to 53 binary digits, and with binary
exponents between -1024 and 1024.  In decimal representation this is close
to 16 decimal digits and decimal exponents in the range of -304..304.
The upshot of all this is that Perl cannot store a number like
12345678901234567 as a floating point number on such architectures without
loss of information.
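
As a rough illustration of these limits, the following sketch prints more
digits than a native float can hold.  It assumes a perl whose floating
point type is the common 64-bit IEEE 754 double; the variable names are
made up for this example, and a perl built with C<long double> support may
behave differently.

    use strict;
    use warnings;

    # Assumption: native floats are 64-bit IEEE 754 doubles.
    my $nine_tenths = 0.9;

    # Asking for 20 significant digits exposes the nearest binary
    # fraction that was actually stored in place of 0.9.
    my $stored = sprintf "%.20g", $nine_tenths;
    print "0.9 is stored as $stored\n";

    # A 17-digit integer written as a float literal needs more than
    # 53 binary digits, so a neighbouring value is stored instead.
    my $wanted   = "12345678901234567";
    my $as_float = sprintf "%.0f", 12345678901234567.0;
    print "$wanted became $as_float\n" if $as_float ne $wanted;

The trailing C<.0> on the literal is what makes it a floating point value
here; without it the same digits would fit in a 64-bit native integer.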

Similarly, decimal strings can represent only those numbers which have a
finite decimal expansion.  Because they are strings, and thus of arbitrary
length, there is no practical limit for the exponent or number of decimal
digits for these numbers.  (But realize that what we are discussing here are
the rules for just the I<storage> of these numbers.  The fact that you can
store such "large" numbers does not mean that the I<operations> over these
numbers will use all of the significant digits.
See L</"Numeric operators and numeric conversions"> for details.)

In fact, numbers stored in the native integer format may be stored either
in the signed native form, or in the unsigned native form.  Thus the limits
for Perl numbers stored as native integers would typically be -2**31..2**32-1,
with appropriate modifications in the case of 64-bit integers.  Again, this
does not mean that Perl can do operations only over integers in this range:
it is possible to store many more integers in floating point format.
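
The range on a particular build can be inspected with a short sketch like
the one below.  It relies on the core C<Config> module and on C<~0>, which
sets every bit of a native integer; the variable names are only
illustrative.

    use strict;
    use warnings;
    use Config;

    # $Config{ivsize} is the size in bytes of perl's native integer.
    my $bits = 8 * $Config{ivsize};

    # ~0 sets every bit, giving the largest unsigned native integer.
    my $max_unsigned = ~0;
    print "native integers are $bits bits wide\n";
    print "largest unsigned native integer: $max_unsigned\n";

    # One past that limit no longer fits, so the sum is held as a
    # floating point number and is only approximate.
    my $past_limit = $max_unsigned + 1;
    print "one past the limit: $past_limit\n";

The last line typically prints in exponential notation on a perl with
64-bit integers, a hint that the value has left the native integer format.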

Summing up, Perl numeric values can store only those numbers which have
a finite decimal expansion or a "short" binary expansion.

=head1 Numeric operators and numeric conversions

As mentioned earlier, Perl can store a number in any one of three formats,
but most operators typically understand only one of those formats.  When
a numeric value is passed as an argument to such an operator, it will be
converted to the format understood by the operator.

Six such conversions are possible:

  native integer        --> native floating point  (*)
  native integer        --> decimal string
  native floating point --> native integer         (*)
  native floating point --> decimal string         (*)
  decimal string        --> native integer
  decimal string        --> native floating point  (*)

These conversions are governed by the following general rules:

=over 4

=item *

If the source number can be represented in the target form, that
representation is used.

=item *

If the source number is outside of the limits representable in the target form,
a representation of the closest limit is used.  (I<Loss of information>)

=item *

If the source number is between two numbers representable in the target form,
a representation of one of these numbers is used.  (I<Loss of information>)

=item *

In C<< native floating point --> native integer >> conversions the magnitude
of the result is less than or equal to the magnitude of the source.
(I<"Rounding to zero".>)

=item *

If the C<< decimal string --> native integer >> conversion cannot be done
without loss of information, the result is compatible with the conversion
sequence C<< decimal string --> native floating point --> native integer >>.
In particular, rounding is strongly biased to 0, though a number like
C<"0.99999999999999999999"> has a chance of being rounded to 1.

=back
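
A few of these rules can be observed from Perl code.  The sketch below is
only an illustration: C<int> and C<sprintf "%d"> are used as convenient
ways to make a conversion visible, and the variable names are invented for
the example.

    use strict;
    use warnings;

    # native float --> native integer: rounding is toward zero.
    my $pos = int(3.7);                    # 3
    my $neg = int(-3.7);                   # -3

    # decimal string --> native float: the exponent part is honoured.
    my $from_string = "12.5e1" + 0;        # 125

    # decimal string --> native integer, behaving like the sequence
    # string --> float --> integer when the string is not an integer.
    my $truncated = sprintf "%d", "3.9";   # "3"

    # native integer --> decimal string is exact.
    my $text = "" . 1234;                  # "1234"

    print "$pos $neg $from_string $truncated $text\n";

Values outside the target range are deliberately left out of the sketch,
since what they convert to depends on the platform and the compiler.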

B<RESTRICTION>: The conversions marked with C<(*)> above involve steps
performed by the C compiler.  In particular, bugs/features of the compiler
used may lead to breakage of some of the above rules.

=head1 Flavors of Perl numeric operations

Perl operations which take a numeric argument treat that argument in one
of four different ways: they may force it to one of the integer, floating
point, or string formats, or they may behave differently depending on the
format of the operand.  Forcing a numeric value to a particular format does
not change the number stored in the value.

All the operators which need an argument in the integer format treat the
argument as in modular arithmetic, e.g., C<mod 2**32> on a 32-bit
architecture.  C<sprintf "%u", -1> therefore provides the same result as
C<sprintf "%u", ~0>.
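
The equivalence stated above can be checked directly.  This is a minimal
sketch with illustrative variable names; the printed value depends on the
width of the native integer.

    use strict;
    use warnings;

    # -1 forced into the unsigned integer format wraps around to the
    # largest unsigned native integer, which is also what ~0 yields.
    my $wrapped  = sprintf "%u", -1;
    my $all_ones = sprintf "%u", ~0;
    print "same value: $wrapped\n" if $wrapped eq $all_ones;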

=over 4

=item Arithmetic operators

The binary operators C<+> C<-> C<*> C</> C<%> C<==> C<!=> C<E<gt>> C<E<lt>>
C<E<gt>=> C<E<lt>=> and the unary operators C<-> C<abs> and C<--> will
attempt to convert arguments to integers.  If both conversions are possible
without loss of precision, and the operation can be performed without
loss of precision then the integer result is used.  Otherwise arguments are
converted to floating point format and the floating point result is used.
The caching of conversions (described at the end of this section) means
that the integer conversion does not throw away fractional parts on
floating point numbers.

=item ++

C<++> behaves as the other operators above, except that if its operand is a string
matching the format C</^[a-zA-Z]*[0-9]*\z/> the string increment described
in L<perlop> is used.

=item Arithmetic operators during C<use integer>

In scopes where C<use integer;> is in force, nearly all the operators listed
above will force their argument(s) into integer format, and return an integer
result.  The exceptions, C<abs>, C<++> and C<-->, do not change their
behavior with C<use integer;>.

=item Other mathematical operators

Operators such as C<**>, C<sin> and C<exp> force arguments to floating point
format.

=item Bitwise operators

Arguments are forced into the integer format if not strings.

=item Bitwise operators during C<use integer>

force arguments to integer format.  Also, shift operations internally use
signed integers rather than the default unsigned.

=item Operators which expect an integer

force the argument into the integer format.  This is applicable
to the third and fourth arguments of C<sysread>, for example.

=item Operators which expect a string

force the argument into the string format.  For example, this is
applicable to C<printf "%s", $value>.

=back
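
The flavors listed above can be compared side by side.  The following is a
small sketch rather than an exhaustive test; the variable names are
invented for the example, and the values in the comments assume ordinary
two's-complement native integers.

    use strict;
    use warnings;

    # Arithmetic operators: the result is a float when it has to be.
    my $exact   = 6 / 3;       # 2
    my $inexact = 7 / 2;       # 3.5

    # ++ on a string matching /^[a-zA-Z]*[0-9]*\z/ is a string increment.
    my $label = "az";
    $label++;                  # "ba"

    # ** forces its arguments to floating point format.
    my $root = 2 ** 0.5;

    # Bitwise operators force non-string arguments to integers.
    my $bits = 12 | 3;         # 15

    my ($int_div, $int_not);
    {
        use integer;
        $int_div = 7 / 2;      # 3: integer division
        $int_not = ~0;         # -1: signed rather than unsigned
    }

    print "$exact $inexact $label $bits $int_div $int_not\n";
    printf "%.3f\n", $root;

The C<use integer;> pragma is lexically scoped, which is why the sketch
confines it to a bare block.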

Though forcing an argument into a particular form does not change the
stored number, Perl remembers the result of such conversions.  In
particular, though the first such conversion may be time-consuming,
repeated operations will not need to redo the conversion.
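
One way to watch this caching happen is to look at a scalar's internal
flags with the core C<B> module.  The sketch below is illustrative only:
C<has_cached_integer> is a hypothetical helper written for this example,
and the flags it reads are an implementation detail that ordinary programs
should not depend on.

    use strict;
    use warnings;
    use B ();

    # Hypothetical helper: true if the scalar currently holds a
    # native integer alongside whatever else it stores.
    sub has_cached_integer {
        my ($ref) = @_;
        my $flags = B::svref_2object($ref)->FLAGS;
        return ($flags & B::SVf_IOK()) ? 1 : 0;
    }

    my $value  = "1234";                       # stored as a string
    my $before = has_cached_integer(\$value);

    my $sum = $value + 1;                      # numeric use of $value

    my $after = has_cached_integer(\$value);
    print "cached before: $before, after: $after\n";
    print "string form is unchanged: $value\n";

The addition does not modify C<$value> as far as Perl code can tell; only
the remembered integer form is added next to the original string.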

=head1 AUTHOR

Ilya Zakharevich C<ilya@math.ohio-state.edu>

Editorial adjustments by Gurusamy Sarathy <gsar@ActiveState.com>

Updates for 5.8.0 by Nicholas Clark <nick@ccl4.org>

=head1 SEE ALSO

L<overload>, L<perlop>