The UTF8_CHECK_ONLY flag overrides the behavior when a non-allowed (by other
flags) malformation is found. If this flag is set, the routine assumes that
the caller will raise a warning, and this function will silently just set
-C<retlen> to C<-1> and return zero.
+C<retlen> to C<-1> (cast to C<STRLEN>) and return zero.
+
+Note that this API requires disambiguation between successful decoding a NUL
+character, and an error return (unless the UTF8_CHECK_ONLY flag is set), as
+in both cases, 0 is returned. To disambiguate, upon a zero return, see if the
+first byte of C<s> is 0 as well. If so, the input was a NUL; if not, the input
+had an error.
Certain code points are considered problematic. These are Unicode surrogates,
Unicode non-characters, and code points above the Unicode maximum of 0x10FFFF.
if (!method) { /* demand load utf8 */
ENTER;
errsv_save = newSVsv(ERRSV);
+ SAVEFREESV(errsv_save);
/* It is assumed that callers of this routine are not passing in
* any user derived data. */
/* Need to do this after save_re_context() as it will set
* PL_tainted to 1 while saving $1 etc (see the code after getrx:
* in Perl_magic_get). Even line to create errsv_save can turn on
* PL_tainted. */
- SAVEBOOL(PL_tainted);
- PL_tainted = 0;
+#ifndef NO_TAINT_SUPPORT
+ SAVEBOOL(TAINT_get);
+ TAINT_NOT;
+#endif
Perl_load_module(aTHX_ PERL_LOADMOD_NOIMPORT, newSVpvn(pkg,pkg_len),
NULL);
if (!SvTRUE(ERRSV))
sv_setsv(ERRSV, errsv_save);
- SvREFCNT_dec(errsv_save);
LEAVE;
}
SPAGAIN;
mPUSHi(none);
PUTBACK;
errsv_save = newSVsv(ERRSV);
+ SAVEFREESV(errsv_save);
/* If we already have a pointer to the method, no need to use
* call_method() to repeat the lookup. */
if (method ? call_sv(MUTABLE_SV(method), G_SCALAR)
}
if (!SvTRUE(ERRSV))
sv_setsv(ERRSV, errsv_save);
- SvREFCNT_dec(errsv_save);
LEAVE;
POPSTACK;
if (IN_PERL_COMPILETIME) {
C<s2>.
If C<pe1> is non-NULL and the pointer it points to is not NULL, that pointer is
-considered an end pointer beyond which scanning of C<s1> will not continue under
-any circumstances. This means that if both C<l1> and C<pe1> are specified, and
-C<pe1>
+considered an end pointer to the position 1 byte past the maximum point
+in C<s1> beyond which scanning will not continue under any circumstances.
+(This routine assumes that UTF-8 encoded input strings are not malformed;
+malformed input can cause it to read past C<pe1>).
+This means that if both C<l1> and C<pe1> are specified, and C<pe1>
is less than C<s1>+C<l1>, the match will never be successful because it can
never
get as far as its goal (and in fact is asserted against). Correspondingly for