package Digest;

use strict;
use vars qw($VERSION %MMAP $AUTOLOAD);

$VERSION = "1.15";

# Maps an algorithm name to its candidate implementations, tried in order.
# An entry is either a class name or [class name, constructor arguments].
%MMAP = (
    "SHA-1"      => ["Digest::SHA1", ["Digest::SHA", 1], ["Digest::SHA2", 1]],
    "SHA-224"    => [["Digest::SHA", 224]],
    "SHA-256"    => [["Digest::SHA", 256], ["Digest::SHA2", 256]],
    "SHA-384"    => [["Digest::SHA", 384], ["Digest::SHA2", 384]],
    "SHA-512"    => [["Digest::SHA", 512], ["Digest::SHA2", 512]],
    "HMAC-MD5"   => "Digest::HMAC_MD5",
    "HMAC-SHA-1" => "Digest::HMAC_SHA1",
    "CRC-16"     => [["Digest::CRC", type => "crc16"]],
    "CRC-32"     => [["Digest::CRC", type => "crc32"]],
    "CRC-CCITT"  => [["Digest::CRC", type => "crcccitt"]],
);

sub new
{
    shift;    # class ignored
    my $algorithm = shift;
    my $impl = $MMAP{$algorithm} || do {
        # Not in %MMAP: derive the module name from the algorithm name,
        # dropping every non-word character.
        $algorithm =~ s/\W+//g;
        "Digest::$algorithm";
    };
    $impl = [$impl] unless ref($impl);
    my $err;
    for (@$impl) {
        my $class = $_;
        my @args;
        ($class, @args) = @$class if ref($class);
        no strict 'refs';
        unless (exists ${"$class\::"}{"VERSION"}) {
            # Load by file name instead of string eval, so that no part
            # of the algorithm name is ever evaluated as Perl code.
            (my $pm_file = "$class.pm") =~ s{::}{/}g;
            eval { require $pm_file };
            if ($@) {
                $err ||= $@;    # remember the first failure
                next;
            }
        }
        return $class->new(@args, @_);
    }
    die $err;
}

# Digest->XXX(@args) is shorthand for Digest->new("XXX", @args).
sub AUTOLOAD
{
    my $class = shift;
    my $algorithm = substr($AUTOLOAD, rindex($AUTOLOAD, '::')+2);
    $class->new($algorithm, @_);
}

1;

__END__

=head1 NAME

Digest - Modules that calculate message digests

=head1 SYNOPSIS

    $md5    = Digest->new("MD5");
    $sha1   = Digest->new("SHA-1");
    $sha256 = Digest->new("SHA-256");
    $sha384 = Digest->new("SHA-384");
    $sha512 = Digest->new("SHA-512");

    $hmac = Digest->HMAC_MD5($key);

=head1 DESCRIPTION

The C<Digest::> modules calculate digests, also called "fingerprints"
or "hashes", of some data, called a message.  The digest is (usually)
a small, fixed-size string.  The actual size of the digest depends on
the algorithm used.  The message is simply a sequence of arbitrary
bytes or bits.

An important property of the digest algorithms is that the digest is
I<likely> to change if the message changes in some way.  Another
property is that digest functions are one-way functions, that is, it
should be I<hard> to find a message that corresponds to some given
digest.  Algorithms differ in how "likely" and how "hard", as well as
how efficient they are to compute.

Note that the properties of the algorithms change over time, as the
algorithms are analyzed and machines grow faster.  If your application,
for instance, depends on it being "impossible" to generate the same
digest for a different message, it is wise to make it easy to plug in
stronger algorithms as the one in use grows weaker.  Using the interface
documented here should make it easy to change algorithms later.
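
As a minimal sketch of that advice, the algorithm name can be kept in
one configurable place so that switching later touches a single line.
The C<$algorithm> variable and the C<fingerprint()> helper are
illustrative names, not part of this module, and the example assumes
that L<Digest::SHA> is installed:

    use strict;
    use warnings;
    use Digest;

    # Hypothetical configuration point: change this one string to
    # move to a different algorithm.
    my $algorithm = "SHA-256";

    # Hypothetical helper that hides the algorithm choice from callers.
    sub fingerprint {
        my ($message) = @_;
        return Digest->new($algorithm)->add($message)->hexdigest;
    }

    print fingerprint("hello world"), "\n";

Callers only ever see C<fingerprint()>, so nothing else needs to
change when a stronger algorithm is plugged in.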

All C<Digest::> modules provide the same programming interface: a
functional interface for simple use, and an object-oriented interface
that can handle messages of arbitrary length and can read files
directly.

The digest can be delivered in three formats:

=over 8

=item I<binary>

This is the most compact form, but it is not well suited for printing
or embedding in places that can't handle arbitrary data.

=item I<hex>

A string of lowercase hexadecimal digits, twice as long as the binary
form.

=item I<base64>

A string of portable printable characters.  This is the base64 encoded
representation of the digest with any trailing padding removed.  The
string will be roughly a third longer than the binary version.
L<MIME::Base64> tells you more about this encoding.

=back

The functional interface consists of importable functions with the
same name as the algorithm.  The functions take the message as argument
and return the digest.  Example:

    use Digest::MD5 qw(md5);
    $digest = md5($message);

There are also versions of the functions with "_hex" or "_base64"
appended to the name, which return the digest in the indicated form.
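
The three output forms can be compared with a short sketch like the
following, which assumes that L<Digest::MD5> is installed and uses
"abc" as an arbitrary sample message:

    use strict;
    use warnings;
    use Digest::MD5 qw(md5 md5_hex md5_base64);

    my $message = "abc";

    my $binary = md5($message);           # 16 raw bytes
    my $hex    = md5_hex($message);       # 32 lowercase hex digits
    my $base64 = md5_base64($message);    # 22 chars, padding removed

    printf "%d %d %d\n", length($binary), length($hex), length($base64);
    print "$hex\n";       # 900150983cd24fb0d6963f7d28e17f72
    print "$base64\n";    # kAFQmDzST7DWlj99KOF/cg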

=head1 OO INTERFACE

The following methods are available for all C<Digest::> modules:

=over 4

=item $ctx = Digest->XXX($arg,...)

=item $ctx = Digest->new(XXX => $arg,...)

=item $ctx = Digest::XXX->new($arg,...)

The constructor returns an object that encapsulates the state of the
message-digest algorithm.  You can add data to the object and finally
ask for the digest.  The "XXX" should of course be replaced by the
proper name of the digest algorithm you want to use.

The first two forms are simply syntactic sugar that automatically
loads the right module on first use.  The second form allows you to use
algorithm names that contain characters which are not legal in Perl
identifiers, e.g. "SHA-1".  If no implementation for the given
algorithm can be found, then an exception is raised.

If new() is called as an instance method (i.e. $ctx->new) it will just
reset the state of the object to the state of a newly created object.
No new object is created in this case, and the return value is the
reference to the object (i.e. $ctx).
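
A minimal sketch of the constructor forms described above, assuming
that L<Digest::MD5> is installed.  "NoSuchAlgorithm" is a made-up name
used only to trigger the exception:

    use strict;
    use warnings;
    use Digest;
    use Digest::MD5;

    # All three forms create an equivalent, empty MD5 context.
    my @ctx = (
        Digest->MD5,
        Digest->new("MD5"),
        Digest::MD5->new,
    );
    print $_->add("abc")->hexdigest, "\n" for @ctx;

    # An unknown algorithm raises an exception that eval can trap.
    my $ctx = eval { Digest->new("NoSuchAlgorithm") };
    print "no implementation found\n" unless $ctx;

    # Called on an existing object, new() resets it and returns it.
    my $md5  = Digest::MD5->new->add("some data");
    my $same = $md5->new;
    print "same object, state reset\n" if $same == $md5;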

=item $other_ctx = $ctx->clone

The clone method creates a copy of the digest state object and returns
a reference to the copy.

=item $ctx->reset

This is just an alias for $ctx->new.

=item $ctx->add( $data )

=item $ctx->add( $chunk1, $chunk2, ... )

The string value of the $data provided as argument is appended to the
message we calculate the digest for.  The return value is the $ctx
object itself.

If more arguments are provided then they are all appended to the
message, so each of these lines has the same effect on the state of
the $ctx object:

    $ctx->add("a"); $ctx->add("b"); $ctx->add("c");
    $ctx->add("a")->add("b")->add("c");
    $ctx->add("a", "b", "c");
    $ctx->add("abc");

Most algorithms are only defined for strings of bytes and this method
might therefore croak if the provided arguments contain chars with
ordinal number above 255.
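
One common way to avoid that failure is to encode character strings
to bytes before adding them.  The sketch below assumes that
L<Digest::MD5> and L<Encode> are installed and picks UTF-8 as the
encoding; that choice is an assumption of the example, and any
encoding works as long as everyone comparing digests agrees on it:

    use strict;
    use warnings;
    use Digest::MD5;
    use Encode qw(encode);

    my $text = "smiley: \x{263A}";    # contains a char above 255

    # Adding the character string directly typically croaks.
    my $ok = eval { Digest::MD5->new->add($text); 1 };
    print "add() failed: $@" unless $ok;

    # Encoding first turns the text into a plain byte string.
    my $bytes = encode("UTF-8", $text);
    print Digest::MD5->new->add($bytes)->hexdigest, "\n";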

=item $ctx->addfile( $io_handle )

The $io_handle is read until EOF and the content is appended to the
message we calculate the digest for.  The return value is the $ctx
object itself.

The addfile() method will croak() if it fails reading data for some
reason.  If it croaks, the state of the $ctx object is unpredictable,
because addfile() might have read part of the file before it failed.
It is probably wise to discard or reset the $ctx object if this occurs.

In most cases you want to make sure that the $io_handle is in
"binmode" before you pass it as argument to the addfile() method.
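
A minimal sketch of that pattern, assuming that L<Digest::MD5> and
L<File::Temp> are installed.  The temporary file only exists to make
the example self-contained; in real code $path would name the file to
be checked:

    use strict;
    use warnings;
    use Digest::MD5;
    use File::Temp qw(tempfile);

    # Create a small sample file to read back.
    my ($out, $path) = tempfile(UNLINK => 1);
    binmode($out);
    print {$out} "abc";
    close($out) or die "close failed: $!";

    open(my $in, "<", $path) or die "Can't open '$path': $!";
    binmode($in);

    my $ctx = Digest::MD5->new;
    my $hex = eval { $ctx->addfile($in)->hexdigest };
    if (!defined $hex) {
        # State is unpredictable after a failed read, so start over.
        $ctx->reset;
        die "Reading '$path' failed: $@";
    }
    close($in);

    print "$hex\n";    # same value as md5_hex("abc")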

=item $ctx->add_bits( $data, $nbits )

=item $ctx->add_bits( $bitstring )

The add_bits() method is an alternative to add() that allows partial
bytes to be appended to the message.  Most users should just ignore
this method as partial bytes are very unlikely to be of any practical
use.

The two argument form of add_bits() will add the first $nbits bits
from $data.  For the last potentially partial byte only the high order
C<< $nbits % 8 >> bits are used.  If $nbits is greater than C<<
length($data) * 8 >>, then this method would do the same as C<<
$ctx->add($data) >>.

The one argument form of add_bits() takes a $bitstring of "1" and "0"
chars as argument.  It's a shorthand for C<< $ctx->add_bits(pack("B*",
$bitstring), length($bitstring)) >>.

The return value is the $ctx object itself.

This example shows two calls that should have the same effect:

    $ctx->add_bits("111100001010");
    $ctx->add_bits("\xF0\xA0", 12);

Most digest algorithms are byte based.  For these it is not possible
to add a number of bits that is not a multiple of 8, and the add_bits()
method will croak if you try.
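
The difference between bit-oriented and byte-oriented implementations
can be sketched as follows.  The example assumes that L<Digest::SHA>
(which accepts partial bytes) and L<Digest::MD5> (which inherits its
add_bits() from L<Digest::base>) are installed:

    use strict;
    use warnings;
    use Digest::SHA;
    use Digest::MD5;

    # Digest::SHA is bit oriented, so a 12-bit message is accepted.
    my $from_string = Digest::SHA->new(1)
        ->add_bits("111100001010")->hexdigest;
    my $from_bytes = Digest::SHA->new(1)
        ->add_bits("\xF0\xA0", 12)->hexdigest;
    print "same digest\n" if $from_string eq $from_bytes;

    # Digest::MD5 is byte oriented: whole bytes work, 12 bits croak.
    my $md5 = Digest::MD5->new;
    $md5->add_bits("\xF0\xA0", 16);
    my $ok = eval { $md5->add_bits("\xF0\xA0", 12); 1 };
    print "partial byte rejected\n" unless $ok;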

=item $ctx->digest

Return the binary digest for the message.

Note that the C<digest> operation is effectively a destructive,
read-once operation.  Once it has been performed, the $ctx object is
automatically C<reset> and can be used to calculate another digest
value.  Call $ctx->clone->digest if you want to calculate the digest
without resetting the digest state.
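
The reset behaviour can be seen in a short sketch like this one, which
assumes that L<Digest::MD5> is installed and uses the hexadecimal form
of the digest only because it is easier to print:

    use strict;
    use warnings;
    use Digest::MD5;

    my $ctx = Digest::MD5->new;
    $ctx->add("ab");

    # Peek at the digest of "ab" without disturbing $ctx.
    my $so_far = $ctx->clone->hexdigest;

    $ctx->add("c");
    my $final = $ctx->hexdigest;    # digest of "abc"; $ctx is now reset

    # A second call sees an empty message, not "abc".
    my $empty = $ctx->hexdigest;

    print "$final\n";    # 900150983cd24fb0d6963f7d28e17f72
    print "$empty\n";    # d41d8cd98f00b204e9800998ecf8427e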

=item $ctx->hexdigest

Same as $ctx->digest, but will return the digest in hexadecimal form.

=item $ctx->b64digest

Same as $ctx->digest, but will return the digest as a base64 encoded
string.

=back

=head1 Digest speed

This table should give some indication of the relative speed of
different algorithms.  It is sorted by throughput, based on a benchmark
done with some implementations of this API:

    Algorithm      Size    Implementation               MB/s

    MD4            128     Digest::MD4 v1.3            165.0
    MD5            128     Digest::MD5 v2.33            98.8
    SHA-256        256     Digest::SHA2 v1.1.0          66.7
    SHA-1          160     Digest::SHA v4.3.1           58.9
    SHA-1          160     Digest::SHA1 v2.10           48.8
    SHA-256        256     Digest::SHA v4.3.1           41.3
    Haval-256      256     Digest::Haval256 v1.0.4      39.8
    SHA-384        384     Digest::SHA2 v1.1.0          19.6
    SHA-512        512     Digest::SHA2 v1.1.0          19.3
    SHA-384        384     Digest::SHA v4.3.1           19.2
    SHA-512        512     Digest::SHA v4.3.1           19.2
    Whirlpool      512     Digest::Whirlpool v1.0.2     13.0
    MD2            128     Digest::MD2 v2.03             9.5

    Adler-32       32      Digest::Adler32 v0.03         1.3
    CRC-16         16      Digest::CRC v0.05             1.1
    CRC-32         32      Digest::CRC v0.05             1.1
    MD5            128     Digest::Perl::MD5 v1.5        1.0
    CRC-CCITT      16      Digest::CRC v0.05             0.8

These numbers were obtained in April 2004 with ActivePerl-5.8.3 running
under Linux on a P4 2.8 GHz CPU.  The last 5 entries are pure-Perl
implementations of the algorithms, which explains why they are so
slow.

=head1 SEE ALSO

L<Digest::Adler32>, L<Digest::CRC>, L<Digest::Haval256>,
L<Digest::HMAC>, L<Digest::MD2>, L<Digest::MD4>, L<Digest::MD5>,
L<Digest::SHA>, L<Digest::SHA1>, L<Digest::SHA2>, L<Digest::Whirlpool>

New digest implementations should consider subclassing from L<Digest::base>.

L<MIME::Base64>

http://en.wikipedia.org/wiki/Cryptographic_hash_function

=head1 AUTHOR

Gisle Aas <gisle@aas.no>

The C<Digest::> interface is based on the interface originally
developed by Neil Winton for his C<MD5> module.

This library is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.

    Copyright 1998-2006 Gisle Aas.
    Copyright 1995,1996 Neil Winton.

=cut