pp_negate and the Unicode Bug
    $ ./perl -Ilib -Mutf8 -CO -le 'print -"3 apples"'
    -3
    $ ./perl -Ilib -Mutf8 -CO -le 'print -"3 μῆλα"'
    -3 μῆλα
This has been this way since 5.10.1. In 5.10.0, it was consistent:
    $ perl5.10.0 -Mutf8 -CO -le 'print -"3 apples"'
    -3
    $ perl5.10.0 -Mutf8 -CO -le 'print -"3 μῆλα"'
    -3
But the worst part is that we get a non-numeric warning now for a
string operation:
    $ perl5.10.1 -Mutf8 -CO -lwe 'print -"3 μῆλα"'
    Argument "\x{33}\x{20}..." isn't numeric in negation (-) at -e line 1.
    -3 μῆλα
This goes back to commit a43d94f2c089, which by itself looks perfectly
correct (I won’t quote the diff here, as it is long; but it doesn’t
touch pp_negate):
    commit a43d94f2c089c6f14197795daeebb7835550a747
    Author: Nicholas Clark <nick@ccl4.org>
    Date:   Mon Jan 7 18:24:39 2008 +0000

        Don't set the public IV or NV flags if the string converted from has
        trailing garbage. This behaviour is consistent with not setting the
        public IV or NV flags if the value is out of range for the type.

        p4raw-id: //depot/perl@32894
It seems that pp_negate was already buggy before that (or ‘validly’
assumed that numeric coercion would set public flags). And it looks
as though commit 8eb28a70b2e is at fault here. It changed this:
    $ perl5.6.2 -Mutf8 -lwe 'print -"ð"'
    -ð
to this:
    $ perl5.8.1 -Mutf8 -lwe 'print -"ð"'
    Argument "\x{f0}" isn't numeric in negation (-) at -e line 1.
    0
to comply with what happens when the UTF8 flag is not set. But it was
relying on bugs in sv_2iv, etc.
So, from 5.8.0 to 5.10.0 inclusive, unary negation prepends "-" if the
string begins with [A-Za-z], but from 5.10.1 onwards it behaves
differently depending on the internal UTF8 flag (even prepending "-" to
ASCII-only strings like "%apples" if the UTF8 flag is on).
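One way to check for the flag dependence on a given perl is to negate the
same string twice, once with the internal UTF8 flag off and once with it
on. The following is only a sketch: negate_both_ways is a hypothetical
helper name, not something in the perl source, and it assumes the input
can be represented in Latin-1 so that utf8::downgrade succeeds.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl): negate one string under both
# internal representations and return the two results.
sub negate_both_ways {
    my ($str) = @_;
    my ($flag_off, $flag_on) = ($str, $str);
    utf8::downgrade($flag_off);   # internal UTF8 flag off
    utf8::upgrade($flag_on);      # internal UTF8 flag on, same characters
    no warnings 'numeric';        # only the results are compared here
    return (-$flag_off, -$flag_on);
}

for my $s ("apples", "%apples", "3 apples") {
    my ($off, $on) = negate_both_ways($s);
    printf "%-12s flag off: %-10s flag on: %-10s %s\n",
        "'$s'", $off, $on, ($off eq $on ? "same" : "DIFFERENT");
}
```

If the description above is right, the "%apples" and "3 apples" rows
should be the ones that report DIFFERENT on 5.10.1, while a
self-consistent perl should report the same result for every row.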
This commit restores the 5.8.0-5.10.0 behaviour, which was at least
self-consistent.