This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
toke.c: Ignore 'use encoding' on \N{}
author Karl Williamson <khw@cpan.org>
Mon, 24 Nov 2014 04:50:41 +0000 (21:50 -0700)
committer Karl Williamson <khw@cpan.org>
Mon, 24 Nov 2014 17:50:26 +0000 (10:50 -0700)
The encoding pragma converts from a specified encoding into Unicode.
\N{} already returns the Unicode form, so the encoding pragma
should not operate on its result.  This commit ensures that.  The only
reason things have appeared to work prior to this commit is that \N{} has
generally returned its value in UTF-8, which 'encoding' knows enough to
not disturb.  However, a custom name translator installed in the program
need not return its value in UTF-8, so this is a bug that just hasn't yet
been exposed.
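
To illustrate the difference, here is a minimal standalone sketch in C.
It is not Perl's implementation: put_utf8, upgrade_plain and
upgrade_via_greek are hypothetical helpers, and the "encoding" is
assumed to be a simplified ISO-8859-7-like mapping in which bytes
0xC1..0xFE (except 0xD2) are Greek letters.  A non-UTF-8 result that
already holds Unicode code points must take the plain path; sending it
through the encoding reinterprets the bytes and changes the characters.

```c
#include <stddef.h>

/* Hypothetical helper: encode one code point below 0x800 as UTF-8.
 * Returns the number of bytes written to out. */
static size_t put_utf8(unsigned cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    out[0] = (unsigned char)(0xC0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
}

/* Plain upgrade: every input byte is already a Unicode code point
 * (0x00..0xFF), so it is only re-encoded as UTF-8.
 * out must have room for 2 * n bytes. */
size_t upgrade_plain(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i, len = 0;
    for (i = 0; i < n; i++)
        len += put_utf8(in[i], out + len);
    return len;
}

/* Upgrade through an assumed ISO-8859-7-like encoding: bytes in the
 * Greek letter range are first mapped to U+0391..U+03CE, then encoded.
 * Simplified on purpose; other high bytes are passed through. */
size_t upgrade_via_greek(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i, len = 0;
    for (i = 0; i < n; i++) {
        unsigned cp = in[i];
        if (cp >= 0xC1 && cp <= 0xFE && cp != 0xD2)
            cp += 0x2D0;    /* e.g. 0xE1 becomes U+03B1 */
        len += put_utf8(cp, out + len);
    }
    return len;
}
```

For the single byte 0xE1 (U+00E1 when read as a code point), the plain
path yields the UTF-8 bytes C3 A1, while the encoding path yields CE B1
(U+03B1), a different character.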

Moreover, the next commit is about to change things so that a regular
\N{} only returns UTF-8 if it has to, so this bug would come up a lot
more often.  There is no need to add a test case because, without
this commit, existing tests in t/uni/greek.t would fail.

toke.c

diff --git a/toke.c b/toke.c
index cb14570..059c463 100644
--- a/toke.c
+++ b/toke.c
@@ -3476,7 +3476,7 @@ S_scan_const(pTHX_ char *start)
                            d = off + SvGROW(sv, off + len + (STRLEN)(send - s) + 1);
                        }
                         if (! SvUTF8(res)) {    /* Make sure \N{} return is UTF-8 */
-                            sv_utf8_upgrade(res);
+                            sv_utf8_upgrade_flags(res, SV_UTF8_NO_ENCODING);
                             str = SvPV_const(res, len);
                         }
                        Copy(str, d, len, char);