- * In the first method, we can allocate a new string, do the memory
- * copy from the s to t - 1, and then proceed through the rest of the
- * string byte-by-byte.
- *
- * In the second method, we proceed through the rest of the input
- * string just calculating how big the converted string will be. Then
- * there are two cases:
- * 1) if the string has enough extra space to handle the converted
- * value. We go backwards through the string, converting until we
- * get to the position we are at now, and then stop. If this
- * position is far enough along in the string, this method is
- * faster than the first method above. If the memory copy were
- * the same speed as the byte-by-byte loop, that position would be
- * about half-way, as at the half-way mark, parsing to the end and
- * back is one complete string's parse, the same amount as
- * starting over and going all the way through. Actually, it
- * would be somewhat less than half-way, as it's faster to just
- * count bytes than to also copy, and we don't have the overhead
- * of allocating a new string, changing the scalar to use it, and
- * freeing the existing one. But if the memory copy is fast, the
- * break-even point is somewhere after half way. The counting
- * loop could be sped up by vectorization, etc, to move the
- * break-even point further towards the beginning.
- * 2) if the string doesn't have enough space to handle the converted
- * value. A new string will have to be allocated, and one might
- * as well, given that, start from the beginning doing the first
- * method. We've spent extra time parsing the string and in
- * exchange all we've gotten is that we know precisely how big to
- * make the new one. Perl is more optimized for time than space,
- * so this case is a loser.
- * So what I've decided to do is not use the 2nd method unless it is
- * guaranteed that a new string won't have to be allocated, assuming
- * the worst case. I also decided not to put any more conditions on it
- * than this, for now. It seems likely that, since the worst case is
- * twice as big as the unknown portion of the string (plus 1), we won't
- * be guaranteed enough space, causing us to go to the first method,
- * unless the string is short, or the first variant character is near
- * the end of it. In either of these cases, it seems best to use the
- * 2nd method. The only circumstance I can think of where this would
- * be really slower is if the string had once had much more data in it
- * than it does now, but there is still a substantial amount in it */
+ * The problem with assuming the worst-case scenario is that, for very
+ * long strings, we could allocate much more memory than is actually
+ * needed, which can create performance problems. If we have to parse
+ * anyway, the second method is the winner as it may avoid an extra
+ * copy. The code used to use the first method under some
+ * circumstances, but now that there is faster variant counting on
+ * ASCII platforms, the second method is used exclusively, eliminating
+ * some code that no longer has to be maintained. */
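The "second method" described above can be illustrated with a minimal sketch. This is not Perl's implementation; the names (`count_variants`, `upgrade_in_place`) are hypothetical, and it assumes the input is Latin-1 bytes where only bytes >= 0x80 ("variants") grow to two bytes in UTF-8. After counting the variants to learn the exact converted size, the buffer is filled from the end backwards, so the destination never overtakes the source:

```c
#include <stddef.h>

/* Count bytes that expand to two bytes in UTF-8 (the "variant" bytes). */
static size_t count_variants(const unsigned char *s, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (s[i] >= 0x80)
            n++;
    return n;
}

/* Upgrade s (len bytes, buffer capacity cap) to UTF-8 in place.
 * Returns the new length, or 0 if the buffer is too small, in which
 * case the caller must fall back to allocating a new string (the
 * "first method").  The worst case needs 2 * len bytes. */
static size_t upgrade_in_place(unsigned char *s, size_t len, size_t cap)
{
    size_t variants = count_variants(s, len);
    size_t newlen = len + variants;

    if (newlen > cap)
        return 0;

    /* Walk backwards: each source byte is written at or after its own
     * position, so reading and writing the same buffer is safe. */
    unsigned char *src = s + len;
    unsigned char *dst = s + newlen;
    while (src > s) {
        unsigned char c = *--src;
        if (c >= 0x80) {
            *--dst = 0x80 | (c & 0x3F);   /* continuation byte */
            *--dst = 0xC0 | (c >> 6);     /* lead byte */
        } else {
            *--dst = c;                   /* ASCII copies unchanged */
        }
    }
    return newlen;
}
```

The backwards walk is what makes the no-allocation case cheap: ASCII prefixes before the first variant are rewritten byte-for-byte in place, and the counting pass is the only extra work compared with a plain copy.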