This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
PATCH: [perl #134064] Assertion failure in toke.c
The blamed commit simplified some code based on the assumption that
UTF-8 well-formedness had already been verified. It had an assertion to
verify this. The test case shows that there is a path, through 'eval',
that bypasses this usual checking.
The checking was based on the assumption that the program started not in
UTF-8, and something like a 'use utf8' would be needed to get it there,
at which point a flag would be set to the effect that well-formedness
should be checked. But it turns out that a string eval (perhaps other
things) gets parsed separately and so the flag wasn't set, so no
well-formedness checking was being done.
The solution is a one word change, to initialize the flag to TRUE "'yes,
check" instead of FALSE "no, don't check" in the initialization routine
run at the beginning of lexing a code unit. This catches eval and
presumably anything else that was being bypassed.
The checking is only actually done if the code being lexed is known to
be in UTF-8. This will continue to get turned on by the ways it
currently gets turned on, such as 'use utf8'.