perl5.git.perl.org Git - perl5.git/commit

author	Nicholas Clark <nick@ccl4.org>
	Thu, 22 Oct 2009 10:50:40 +0000 (11:50 +0100)
committer	Nicholas Clark <nick@ccl4.org>
	Thu, 22 Oct 2009 12:06:13 +0000 (13:06 +0100)
commit	c28d61051c446453c532f387d478df78d6f95c55
tree	b5269841b136d4b6de17e091386aa9621b76c683
parent	9fb03e618192b6b5d49274cc64422acee51fe198

Re-write S_utf16_textfilter() to correctly handle partial reads of UTF-16.

Treat any (and all) octets after the BOM (or all octets, if there was no BOM) as
initial read data for the filter, and call it to convert them to the first
line, reading more if necessary. This correctly handles the "problem" that
UTF-16LE read as a line, on the assumption that it's ASCII/ISO-8859-*/UTF-8/etc.,
will be truncated after the first octet of the "\n\0" pair that is "\n"
encoded as UTF-16LE. This fixes bug #69678.
Read from the upstream filter in block mode, rather than line mode.
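
The partial-read problem described above can be illustrated with a small
standalone sketch. This is not the code from toke.c: the names utf16le_state
and utf16le_feed are hypothetical, and the sketch only shows the general idea
of holding back an odd trailing octet until the next read supplies its partner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical carry-over state; not taken from toke.c. */
typedef struct {
    int have_pending;       /* 1 if an odd trailing octet was held back */
    unsigned char pending;  /* low octet still waiting for its high octet */
} utf16le_state;

/*
 * Convert one chunk of UTF-16LE octets into 16-bit code units.
 * A chunk may end in the middle of a code unit (for example, a line-mode
 * read that stops right after the 0x0A of "\n\0"); the dangling octet is
 * kept in *st and completed by the next call.
 * Returns the number of code units written to out.
 */
size_t utf16le_feed(utf16le_state *st, const unsigned char *in, size_t len,
                    uint16_t *out, size_t out_cap)
{
    size_t n = 0;
    size_t i = 0;

    /* Caller must supply room for every unit this chunk can complete. */
    assert(out_cap >= (len + 1) / 2);

    /* Finish the code unit left over from the previous chunk, if any. */
    if (st->have_pending && len > 0) {
        out[n++] = (uint16_t)(st->pending | ((unsigned)in[0] << 8));
        st->have_pending = 0;
        i = 1;
    }

    /* Convert every complete low/high octet pair. */
    for (; i + 1 < len; i += 2)
        out[n++] = (uint16_t)(in[i] | ((unsigned)in[i + 1] << 8));

    /* Hold back a trailing odd octet instead of dropping it. */
    if (i < len) {
        st->pending = in[i];
        st->have_pending = 1;
    }
    return n;
}
```

Feeding the octets 61 00 0A followed by 00 62 00 yields the code unit for "a"
from the first call, then "\n" and "b" from the second, so the split "\n\0"
pair is reassembled rather than truncated.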

t/comp/utf.t
toke.c