Commit | Line | Data |
---|---|---|

ac65edd0 GS |
1 | =head1 NAME |

2 | ||

3 | perlnumber - semantics of numbers and numeric operations in Perl | |

4 | ||

5 | =head1 SYNOPSIS | |

6 | ||

7 | $n = 1234; # decimal integer | |

8 | $n = 0b1110011; # binary integer | |

9 | $n = 01234; # octal integer | |

10 | $n = 0x1234; # hexadecimal integer | |

11 | $n = 12.34e-56; # exponential notation | |

12 | $n = "-12.34e56"; # number specified as a string | |

13 | $n = "1234"; # number specified as a string | |

14 | $n = v49.50.51.52; # number specified as a string, which in | |

15 | # turn is specified in terms of numbers :-) | |

16 | ||

17 | =head1 DESCRIPTION | |

18 | ||

19 | This document describes how Perl internally handles numeric values. | |

20 | ||

21 | Perl's operator overloading facility is completely ignored here. Operator | |

22 | overloading allows user-defined behaviors for numbers, such as operations | |

23 | over arbitrarily large integers, floating point numbers with arbitrary | |

24 | precision, operations over "exotic" numbers such as modular arithmetic or | |

055fd3a9 | 25 | p-adic arithmetic, and so on. See L<overload> for details. |

ac65edd0 GS |
26 | |

27 | =head1 Storing numbers | |

28 | ||

b38f6a39 | 29 | Perl can internally represent numbers in 3 different ways: as native |

ac65edd0 GS |
30 | integers, as native floating point numbers, and as decimal strings. |

31 | Decimal strings may have an exponential notation part, as in C<"12.34e-56">. | |

32 | I<Native> here means "a format supported by the C compiler which was used | |

33 | to build perl". | |

34 | ||

35 | The term "native" does not mean quite as much when we talk about native | |

36 | integers, as it does when native floating point numbers are involved. | |

37 | The only implication of the term "native" on integers is that the limits for | |

38 | the maximal and the minimal supported true integral quantities are close to | |

85add8c2 | 39 | powers of 2. However, "native" floats have a most fundamental |

ac65edd0 GS |
40 | restriction: they may represent only those numbers which have a relatively |

41 | "short" representation when converted to a binary fraction. For example, | |

4375e838 | 42 | 0.9 cannot be represented by a native float, since the binary fraction |

ac65edd0 GS |
43 | for 0.9 is infinite: |

44 | ||

45 | binary0.1110011001100... | |

46 | ||

47 | with the sequence C<1100> repeating again and again. In addition to this | |

48 | limitation, the exponent of the binary number is also restricted when it | |

49 | is represented as a floating point number. On typical hardware, floating | |

50 | point values can store numbers with up to 53 binary digits, and with binary | |

51 | exponents between -1024 and 1024. In decimal representation this is close | |

52 | to 16 decimal digits and decimal exponents in the range of -304..304. | |

53 | The upshot of all this is that Perl cannot store a number like | |

54 | 12345678901234567 as a floating point number on such architectures without | |

55 | loss of information. | |
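
The two limits described above can be observed directly. The following is a minimal sketch, not part of the original text: the helper name as_native_double is hypothetical, and it assumes that pack/unpack with the "d" template round-trips a value through the C double type of the build, which on typical hardware carries 53 binary digits.

```perl
use strict;
use warnings;

# Hypothetical helper: force a value through the native C "double"
# format by packing and unpacking it with the "d" template.
sub as_native_double {
    my ($value) = @_;
    my ($double) = unpack "d", pack "d", $value;
    return $double;
}

# 0.9 has an infinite binary fraction, so the stored double is only
# an approximation; printing 20 decimals exposes the difference.
my $nine_tenths = sprintf "%.20f", as_native_double(0.9);
print "0.9 as a native double: $nine_tenths\n";

# 12345678901234567 needs more than 53 significant binary digits,
# so a neighbouring representable number is stored instead.
my $big = sprintf "%.0f", as_native_double("12345678901234567");
print "12345678901234567 as a native double: $big\n";
```

Both printed values should differ from the decimal strings that were supplied, which is the loss of information the text refers to.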

56 | ||

b38f6a39 | 57 | Similarly, decimal strings can represent only those numbers which have a |

ac65edd0 GS |
58 | finite decimal expansion. Because strings can have arbitrary length, there |

59 | is no practical limit for the exponent or number of decimal digits for these | |

60 | numbers. (But realize that we are discussing the rules for just the | |

61 | I<storage> of these numbers. The fact that you can store such "large" numbers | |

106325ad | 62 | does not mean that the I<operations> over these numbers will use all |

ac65edd0 | 63 | of the significant digits. |

4a4eefd0 | 64 | See L<"Numeric operators and numeric conversions"> for details.) |

ac65edd0 GS |
65 | |

66 | In fact, numbers stored in the native integer format may be stored either | |

67 | in the signed native form, or in the unsigned native form. Thus the limits | |

68 | for Perl numbers stored as native integers would typically be -2**31..2**32-1, | |

69 | with appropriate modifications in the case of 64-bit integers. Again, this | |

70 | does not mean that Perl can do operations only over integers in this range: | |

71 | it is possible to store many more integers in floating point format. | |

72 | ||

73 | Summing up, Perl numeric values can store only those numbers which have | |

74 | a finite decimal expansion or a "short" binary expansion. | |

75 | ||

76 | =head1 Numeric operators and numeric conversions | |

77 | ||

78 | As mentioned earlier, Perl can store a number in any one of three formats, | |

79 | but most operators typically understand only one of those formats. When | |

80 | a numeric value is passed as an argument to such an operator, it will be | |

81 | converted to the format understood by the operator. | |

82 | ||

83 | Six such conversions are possible: | |

84 | ||

85 | native integer --> native floating point (*) | |

86 | native integer --> decimal string | |

87 | native floating point --> native integer (*) | |

88 | native floating point --> decimal string (*) | |

89 | decimal string --> native integer | |

90 | decimal string --> native floating point (*) | |

91 | ||

92 | These conversions are governed by the following general rules: | |

93 | ||

94 | =over | |

95 | ||

96 | =item * | |

97 | ||

98 | If the source number can be represented in the target form, that | |

99 | representation is used. | |

100 | ||

101 | =item * | |

102 | ||

103 | If the source number is outside of the limits representable in the target form, | |

104 | a representation of the closest limit is used. (I<Loss of information>) | |

105 | ||

106 | =item * | |

107 | ||

108 | If the source number is between two numbers representable in the target form, | |

109 | a representation of one of these numbers is used. (I<Loss of information>) | |

110 | ||

111 | =item * | |

112 | ||

113 | In C<< native floating point --> native integer >> conversions the magnitude | |

114 | of the result is less than or equal to the magnitude of the source. | |

115 | (I<"Rounding to zero".>) | |

116 | ||

117 | =item * | |

118 | ||

119 | If the C<< decimal string --> native integer >> conversion cannot be done | |

120 | without loss of information, the result is compatible with the conversion | |

121 | sequence C<< decimal_string --> native_floating_point --> native_integer >>. | |

122 | In particular, rounding is strongly biased to 0, though a number like | |

123 | C<"0.99999999999999999999"> has a chance of being rounded to 1. | |

124 | ||

125 | =back | |

126 | ||

127 | B<RESTRICTION>: The conversions marked with C<(*)> above involve steps | |

128 | performed by the C compiler. In particular, bugs/features of the compiler | |

129 | used may lead to breakage of some of the above rules. | |
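
A few of the rules above can be sketched with int and sprintf "%d", both of which ask for an integer. This is an illustrative sketch only; the variable names are made up for the example, and the result for the out-of-range case is printed rather than asserted because, as the RESTRICTION notes, it depends on the C compiler used to build perl.

```perl
use strict;
use warnings;

# Native floating point --> native integer: the magnitude of the
# result never exceeds the magnitude of the source.
my @truncated = map { int($_) } (2.7, -2.7);
print "int(2.7) = $truncated[0], int(-2.7) = $truncated[1]\n";

# Decimal string --> native integer: the same bias toward zero.
my $from_string = sprintf "%d", "3.99";
print qq{"3.99" as an integer: $from_string\n};

# A source outside the integer limits should map to the closest
# limit; the exact value depends on the native integer size.
my $clamped = sprintf "%d", 1e30;
print "1e30 as an integer: $clamped\n";
```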

130 | ||

131 | =head1 Flavors of Perl numeric operations | |

132 | ||

133 | Perl operations which take a numeric argument treat that argument in one | |

134 | of four different ways: they may force it to one of the integer/floating/ | |

135 | string formats, or they may behave differently depending on the format of | |

136 | the operand. Forcing a numeric value to a particular format does not | |

137 | change the number stored in the value. | |

138 | ||

139 | All the operators which need an argument in the integer format treat the | |

140 | argument as in modular arithmetic, e.g., C<mod 2**32> on a 32-bit | |

141 | architecture. C<sprintf "%u", -1> therefore provides the same result as | |

142 | C<sprintf "%u", ~0>. | |
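
The equivalence stated above can be checked with a short sketch. The variable names are illustrative; the printed value depends on whether perl was built with 32-bit or 64-bit native integers.

```perl
use strict;
use warnings;

# "%u" wants an unsigned integer, so -1 is taken modulo 2**N,
# where N is the width of the native integer.
my $from_minus_one = sprintf "%u", -1;

# ~0 is the native unsigned integer with all bits set.
my $from_all_ones = sprintf "%u", ~0;

print "sprintf '%u', -1 gives $from_minus_one\n";
print "sprintf '%u', ~0 gives $from_all_ones\n";
```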

143 | ||

144 | =over | |

145 | ||

146 | =item Arithmetic operators, C<no integer> | |

147 | ||

148 | force the argument into the floating point format. | |

149 | ||

150 | =item Arithmetic operators, C<use integer> | |

151 | ||

152 | =item Bitwise operators, C<no integer> | |

153 | ||

154 | force the argument into the integer format if it is not a string. | |

155 | ||

156 | =item Bitwise operators, C<use integer> | |

157 | ||

158 | force the argument into the integer format. | |

159 | ||

160 | =item Operators which expect an integer | |

161 | ||

162 | force the argument into the integer format. This is applicable | |

163 | to the third and fourth arguments of C<sysread>, for example. | |

164 | ||

165 | =item Operators which expect a string | |

166 | ||

167 | force the argument into the string format. For example, this is | |

168 | applicable to C<printf "%s", $value>. | |

169 | ||

170 | =back | |
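
Some of the flavors listed above can be seen side by side in the sketch below. It assumes the default behavior of the bitwise operators, that is, without the "bitwise" feature of newer perls; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Arithmetic operators: floating point by default, integer
# semantics inside the lexical scope of "use integer".
my $float_quotient = 7 / 2;
my $int_quotient   = do { use integer; 7 / 2 };

# Bitwise operators without "use integer": numbers are forced to
# integers, but two string operands are combined character by character.
my $numeric_or = 12 | 3;
my $string_or  = "12" | "3";

# Bitwise negation is unsigned by default and signed under "use integer".
my $unsigned_not = ~0;
my $signed_not   = do { use integer; ~0 };

print "7 / 2: $float_quotient (no integer), $int_quotient (use integer)\n";
print "12 | 3 = $numeric_or, \"12\" | \"3\" = \"$string_or\"\n";
print "~0: $unsigned_not (no integer), $signed_not (use integer)\n";
```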

171 | ||

172 | Though forcing an argument into a particular form does not change the | |

173 | stored number, Perl remembers the result of such conversions. In | |

174 | particular, though the first such conversion may be time-consuming, | |

175 | repeated operations will not need to redo the conversion. | |
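
One way to observe this caching is to inspect the internal flags of a scalar with the core B module. This is a sketch that peeks at implementation details, which are an assumption here rather than something the text above specifies: it relies on the B::SVf_IOK and B::SVp_IOK flag constants marking a scalar that currently holds an integer representation.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: true if the scalar currently carries a
# cached integer representation (public or private IOK flag).
sub has_integer_slot {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    return ($flags & (B::SVf_IOK | B::SVp_IOK)) ? 1 : 0;
}

my $value  = "42";                      # stored as a decimal string
my $before = has_integer_slot(\$value);

my $sum = $value + 0;                   # numeric use triggers a conversion

my $after = has_integer_slot(\$value);

print "integer cached before numeric use: $before\n";
print "integer cached after numeric use:  $after\n";
print "string value is still: $value\n";
```

The string itself is unchanged by the numeric use; only an additional representation is remembered alongside it.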

176 | ||

177 | =head1 AUTHOR | |

178 | ||

179 | Ilya Zakharevich C<ilya@math.ohio-state.edu> | |

180 | ||

181 | Editorial adjustments by Gurusamy Sarathy <gsar@ActiveState.com> | |

182 | ||

183 | =head1 SEE ALSO | |

184 | ||

055fd3a9 | 185 | L<overload> |