This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
Make utf8_to_uvchr() slightly safer
author Karl Williamson <khw@cpan.org>
Tue, 31 Jul 2018 03:41:44 +0000 (21:41 -0600)
committer Karl Williamson <khw@cpan.org>
Fri, 3 Aug 2018 19:13:24 +0000 (13:13 -0600)
Recent commit aa3c16bd709ef9b9c8c785af48f368e08f70c74b made this
function safe if the input is a NUL-terminated string.  But if it is
not, the function can read past the end of the buffer, because it used
as its limit the maximum length a UTF-8 code point can be.  Most code
points in real-world use aren't nearly that long, and we know how long
a given one should be by looking at its first byte.  Therefore, use the
length determined by the first byte as the limit instead of the maximum
possible.
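
The idea can be illustrated outside of the Perl source with a minimal,
standalone sketch.  The helper names below (skip_from_first_byte,
bounded_strnlen, read_limit) are hypothetical stand-ins for UTF8SKIP,
my_strnlen, and the end pointer handed to utf8_to_uvchr_buf; they are not
the real implementations, and the sketch assumes standard UTF-8 of at
most 4 bytes rather than Perl's extended encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for UTF8SKIP(): the number of bytes a sequence
 * claims to occupy, judged only from its first byte.  Assumes standard
 * UTF-8 (1 to 4 bytes); anything at or above 0xF0 is treated as 4 here. */
static size_t skip_from_first_byte(const unsigned char *s)
{
    unsigned char b = s[0];

    if (b < 0x80) return 1;   /* ASCII */
    if (b < 0xC0) return 1;   /* stray continuation byte: look no further */
    if (b < 0xE0) return 2;
    if (b < 0xF0) return 3;
    return 4;
}

/* Hypothetical stand-in for my_strnlen(): like strlen(), but never
 * examines more than maxlen bytes. */
static size_t bounded_strnlen(const unsigned char *s, size_t maxlen)
{
    size_t n = 0;

    while (n < maxlen && s[n] != '\0') {
        n++;
    }
    return n;
}

/* End pointer to pass to a bounded decoder: stop at the first NUL or at
 * the length promised by the first byte, whichever comes first. */
static const unsigned char *read_limit(const unsigned char *s)
{
    assert(s != NULL);
    return s + bounded_strnlen(s, skip_from_first_byte(s));
}
```

With this limit, a one-byte buffer holding just 'A' and no terminating
NUL is scanned for exactly one byte, where a fixed maximum-length limit
would have looked at the bytes following it.  A buffer that ends partway
through a multi-byte character, without a NUL, can still be over-read,
which fits the "slightly safer" wording of the subject line.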

utf8.c

diff --git a/utf8.c b/utf8.c
index ceb8ed8..06b7768 100644
--- a/utf8.c
+++ b/utf8.c
@@ -5755,8 +5755,8 @@ Perl_utf8_to_uvchr(pTHX_ const U8 *s, STRLEN *retlen)
     }
 
     return utf8_to_uvchr_buf(s,
-                             s + my_strnlen((char *) s, UTF8_MAXBYTES),
-                            retlen);
+                             s + my_strnlen((char *) s, UTF8SKIP(s)),
+                             retlen);
 }
 
 /*