This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
Make pos less volatile when UTF8-ness can change
author Father Chrysostomos <sprout@cpan.org>
Wed, 26 Sep 2012 03:33:30 +0000 (20:33 -0700)
committer Father Chrysostomos <sprout@cpan.org>
Mon, 1 Oct 2012 19:51:50 +0000 (12:51 -0700)
commit 57e30c7a9fde9284d1ac854d4d6f97863ff461f5
tree 7072a88af0c78927624197c085c73e0b5217f209
parent 033de87f380b64bf9f558f5d5c412d31230040c0
Make pos less volatile when UTF8-ness can change

This was brought up in ticket #114690.

pos checks the length of the string and then its UTF8-ness.  But the
UTF8-ness is not updated by length magic.  So it can get confused if
simply stringifying a match var happens to flip the UTF8 flag:

$ perl -le '"\x{100}a" =~ /(..)/; pos($1) = 2; print pos($1); "$1";
print pos($1)'
2
1

$ perl -le '"\x{100}a" =~ /(.)/; pos($1) = 2; print pos($1); "$1"; print
pos($1)'
1
Malformed UTF-8 character (unexpected end of string) in match position
at -e line 1.
0

As pointed out in that ticket, length magic on scalars cannot work
properly with UTF8, so stop using it.
sv.c
t/op/pos.t