perlapi: Clarify NUL handling for 2 fcns; nits

author Karl Williamson <khw@cpan.org>

Wed, 23 Apr 2014 19:51:48 +0000 (13:51 -0600)

committer Karl Williamson <khw@cpan.org>

Wed, 23 Apr 2014 23:08:08 +0000 (17:08 -0600)
diff --git a/utf8.c b/utf8.c

index fa5b4a7..dab5387 100644
--- a/utf8.c
+++ b/utf8.c
@@ -57,7 +57,9 @@ or not the string is encoded in UTF-8 (or UTF-EBCDIC on EBCDIC machines).  That
  is, if they are invariant.  On ASCII-ish machines, only ASCII characters
  fit this definition, hence the function's name.
  
-If C<len> is 0, it will be calculated using C<strlen(s)>.  
+If C<len> is 0, it will be calculated using C<strlen(s)>, (which means if you
+use this option, that C<s> can't have embedded C<NUL> characters and has to
+have a terminating C<NUL> byte).
  
  See also L</is_utf8_string>(), L</is_utf8_string_loclen>(), and L</is_utf8_string_loc>().
  
@@ -401,9 +403,9 @@ Perl_is_utf8_char(const U8 *s)
  
  Returns true if the first C<len> bytes of string C<s> form a valid
  UTF-8 string, false otherwise.  If C<len> is 0, it will be calculated
-using C<strlen(s)> (which means if you use this option, that C<s> has to have a
-terminating NUL byte).  Note that all characters being ASCII constitute 'a
-valid UTF-8 string'.
+using C<strlen(s)> (which means if you use this option, that C<s> can't have
+embedded C<NUL> characters and has to have a terminating C<NUL> byte).  Note
+that all characters being ASCII constitute 'a valid UTF-8 string'.
  
  See also L</is_ascii_string>(), L</is_utf8_string_loclen>(), and L</is_utf8_string_loc>().
  
@@ -548,11 +550,11 @@ flags) malformation is found.  If this flag is set, the routine assumes that
  the caller will raise a warning, and this function will silently just set
  C<retlen> to C<-1> (cast to C<STRLEN>) and return zero.
  
-Note that this API requires disambiguation between successful decoding a NUL
+Note that this API requires disambiguation between successful decoding a C<NUL>
  character, and an error return (unless the UTF8_CHECK_ONLY flag is set), as
  in both cases, 0 is returned.  To disambiguate, upon a zero return, see if the
-first byte of C<s> is 0 as well.  If so, the input was a NUL; if not, the input
-had an error.
+first byte of C<s> is 0 as well.  If so, the input was a C<NUL>; if not, the
+input had an error.
  
  Certain code points are considered problematic.  These are Unicode surrogates,
  Unicode non-characters, and code points above the Unicode maximum of 0x10FFFF.
@@ -1400,7 +1402,7 @@ UTF-8.
  Returns a pointer to the newly-created string, and sets C<len> to
  reflect the new length in bytes.
  
-A NUL character will be written after the end of the string.
+A C<NUL> character will be written after the end of the string.
  
  If you want to convert to UTF-8 from encodings other than
  the native (Latin1 or EBCDIC),
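
The caveat added to the C<is_ascii_string> and C<is_utf8_string> documentation can be illustrated in plain C. The sketch below does not call the Perl API; C<effective_len> is a hypothetical helper, introduced here only to mimic the documented convention that a C<len> of 0 means "use C<strlen(s)>". It shows why a buffer with an embedded C<NUL> is cut short when the length is left to C<strlen>.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of perlapi): returns the number of bytes
 * a function following the "len == 0 means strlen(s)" convention would
 * actually examine. */
static size_t effective_len(const char *s, size_t len)
{
    /* strlen() stops at the first NUL byte, so s must be NUL-terminated,
     * and anything after an embedded NUL is never seen. */
    return len ? len : strlen(s);
}

/* Seven payload bytes with an embedded NUL at index 3; the compiler
 * appends the terminating NUL, so sizeof(sample) is 8. */
static const char sample[] = "abc\0def";

/* Length of the payload, excluding the terminating NUL. */
static size_t sample_len(void)
{
    return sizeof(sample) - 1;
}
```

With an explicit length, C<effective_len(sample, sample_len())> covers all 7 bytes; with C<len> given as 0 it covers only the 3 bytes before the embedded C<NUL>, so the remaining bytes would go unchecked.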