This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
XXX utf8.c: calculate variants instead of assuming worst case
When converting a byte string to UTF-8, the needed size may increase due
to some bytes (the UTF-8 variants) occupying two bytes instead of one
under UTF-8.
Prior to this commit, the string was assumed to contain only variants,
and enough memory was allocated for the worst case, then the excess was
returned at the end.
This commit actually calculates how much space is needed and allocates
only that, so there is no need to trim afterwards.
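For illustration only, here is a minimal sketch of the count-then-allocate idea. It is not the code in utf8.c: the function names are hypothetical, it uses plain malloc() rather than Perl's allocation macros, and it assumes an ASCII platform where the variants are exactly the bytes 0x80..0xFF.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Count the bytes that take two bytes under UTF-8.  Assumption: ASCII
 * platform, so a variant is any byte with its high bit set. */
static size_t count_variants(const unsigned char *s, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        n += s[i] >> 7;             /* adds 1 for 0x80..0xFF, else 0 */
    return n;
}

/* Hypothetical helper: convert a native byte string to UTF-8 in a buffer
 * sized exactly.  Returns a malloc'd, NUL-terminated buffer and stores
 * its length (excluding the NUL) in *out_len; returns NULL on failure. */
static unsigned char *bytes_to_utf8_exact(const unsigned char *s,
                                          size_t len, size_t *out_len)
{
    size_t need = len + count_variants(s, len);  /* one extra per variant */
    unsigned char *out = malloc(need + 1);
    unsigned char *d = out;

    if (out == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = s[i];
        if (c < 0x80) {
            *d++ = c;                            /* invariant: copy as-is */
        } else {
            *d++ = (unsigned char)(0xC0 | (c >> 6));    /* start byte */
            *d++ = (unsigned char)(0x80 | (c & 0x3F));  /* continuation */
        }
    }

    assert((size_t)(d - out) == need);  /* exact fit, nothing to trim */
    *d = '\0';
    *out_len = need;
    return out;
}
```

Because the buffer is sized from the count, the final length always equals the allocation, which is why no trimming step is needed.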
This calculation is extra work, but the string is scanned a word at a
time rather than byte by byte. For short strings, it doesn't much
matter either way. For very long strings, it seems to me that the
consequences of potentially allocating far too much memory outweigh
the cost of this extra work.
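As a rough sketch of why counting a word at a time is cheap, the following counts high-bit bytes eight at a time. It is an assumption-laden illustration, not the routine this commit uses: the function name is hypothetical, the word size is fixed at 64 bits, and it again assumes an ASCII platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: count bytes with the high bit set, processing
 * eight bytes per iteration and finishing the tail byte by byte. */
static size_t count_variants_wordwise(const unsigned char *s, size_t len)
{
    const uint64_t high_bits = UINT64_C(0x8080808080808080);
    const uint64_t ones      = UINT64_C(0x0101010101010101);
    size_t n = 0;
    size_t i = 0;

    assert(s != NULL || len == 0);

    for (; i + 8 <= len; i += 8) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);    /* safe for unaligned input */

        /* Keep only each byte's high bit and shift it to the low bit,
         * so every byte of w is now 0 or 1. */
        w = (w & high_bits) >> 7;

        /* Multiplying by 0x0101...01 sums the eight bytes into the top
         * byte; the sum is at most 8, so it cannot overflow. */
        n += (size_t)((w * ones) >> 56);
    }

    for (; i < len; i++)                /* leftover bytes */
        n += s[i] >> 7;

    return n;
}
```

The result does not depend on byte order, since the bytes within each word are only summed.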