This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
Use real illegal UTF-8 byte
author Karl Williamson <public@khwilliamson.com>
Tue, 19 Feb 2013 22:13:19 +0000 (15:13 -0700)
committer Karl Williamson <public@khwilliamson.com>
Thu, 29 Aug 2013 15:55:52 +0000 (09:55 -0600)
commit e7214ce8dd2816e52abdfe522e7ff5adc81ba23e
tree b0aae3d2fe1253cb4d4269fd6be1d6186b7ff9f4
parent 069727664af50f1d767b5928bad25bcb51f0644c

The code here was wrong in assuming that \xFF is not legal in UTF-8
encoded strings.  Using \xFF there currently doesn't work due to a bug, but
that may eventually be fixed: [perl #116867].  The comments are also wrong
in claiming that all bytes are legal in UTF-EBCDIC.

It turns out that in well-formed UTF-8, the bytes C0 and C1 never appear
(nor do C2, C3, and C4 in UTF-EBCDIC), as each would be the start byte
of an illegal overlong sequence.
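
The overlong argument can be checked with a little arithmetic.  The following
is a minimal sketch, not code from this commit: it assumes an ASCII platform
(it does not model UTF-EBCDIC), and the helper names decode_two_byte and
max_for_start are hypothetical.  It shows that a two-byte sequence starting
with C0 or C1 can only reach code points below 0x80, which already have a
one-byte encoding.

```c
#include <stdint.h>

/* Decode a two-byte UTF-8 sequence with no validity checks.
 * Bit layout: 110xxxxx 10yyyyyy  ->  code point xxxxxyyyyyy. */
static uint32_t decode_two_byte(uint8_t start, uint8_t cont)
{
    return ((uint32_t)(start & 0x1F) << 6) | (uint32_t)(cont & 0x3F);
}

/* Largest code point reachable from a given two-byte start byte,
 * i.e. the value decoded with the highest continuation byte, 0xBF. */
static uint32_t max_for_start(uint8_t start)
{
    return decode_two_byte(start, 0xBF);
}
```

Under these assumptions, max_for_start(0xC0) is 0x3F and max_for_start(0xC1)
is 0x7F, so every sequence beginning with either byte is overlong, while
decode_two_byte(0xC2, 0x80) is 0x80, the first code point that really needs
two bytes.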

This creates a #define for an illegal byte using one of the real illegal
ones, and changes the code to use that.
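
As a rough illustration of the idea, and not the actual definition added to
utf8.h, a byte that can never occur in well-formed UTF-8 can serve as an
in-band marker.  The macro name SKETCH_ILLEGAL_UTF8_BYTE and the helper
contains_sentinel below are hypothetical, and the value 0xC1 assumes an ASCII
platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for this sketch; not the macro from utf8.h.
 * On ASCII platforms 0xC1 never occurs in well-formed UTF-8. */
#define SKETCH_ILLEGAL_UTF8_BYTE 0xC1

/* Return 1 if the buffer contains the marker byte, else 0. */
static int contains_sentinel(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == SKETCH_ILLEGAL_UTF8_BYTE)
            return 1;
    }
    return 0;
}
```

Because the marker cannot be part of any valid character, finding it in a
buffer of otherwise well-formed UTF-8 is unambiguous.  That would not hold
for \xFF if it can be legal in such strings.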

No test is included due to #116867.
op.c
toke.c
utf8.h