From 6cbb924831d50981620d4c51f8b12da5f269e569 Mon Sep 17 00:00:00 2001
From: Karl Williamson
Date: Sun, 11 Sep 2016 09:40:37 -0600
Subject: [PATCH] perlapi: Reword description of is_utf8_valid_partial_char

---
 inline.h | 37 ++++++++++++++++++-------------------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/inline.h b/inline.h
index 5384281..41d0a9c 100644
--- a/inline.h
+++ b/inline.h
@@ -505,25 +505,24 @@ Perl_utf8_hop(const U8 *s, SSize_t off)
 
 =for apidoc is_utf8_valid_partial_char
 
-Returns 1 if there exists some sequence of bytes, call it C<x>, that when
-appended to the sequence from C<s> through S<C<e> - 1> causes the entire
-sequence starting at C<s> (including C<x>) to be the well-formed UTF-8 of
-some code point; otherwise returns 0.
-
-In other words this returns TRUE if C<s> points to the beginning, but partial,
-sequence of the UTF-8 for some code point.
-
-This is useful when some fixed-length buffer is being tested for being
-well-formed UTF-8, but the final few bytes in it don't comprise a full
-character: it is split somewhere in the middle of its UTF-8 representation.
-(Presumably when the buffer is refreshed with the next chunk of data, the new
-first bytes will complete the partial code point.) This function is used to
-verify that the final bytes in the current buffer are in fact the legal
-beginning of some code point, so that if they aren't, the failure can be
-signalled without having to wait for the next read.
-
-If the bytes terminated at S<C<e> - 1> are a full character (or more), 0 is
-returned.
+Returns 0 if the sequence of bytes starting at C<s> and looking no further than
+S<C<e> - 1> is the UTF-8 encoding, as extended by Perl, for one or more code
+points. Otherwise, it returns 1 if there exists at least one non-empty
+sequence of bytes that when appended to sequence C<s>, starting at position
+C<e> causes the entire sequence to be the well-formed UTF-8 of some code point;
+otherwise returns 0.
+
+In other words this returns TRUE if C<s> points to a partial UTF-8-encoded code
+point.
+
+This is useful when a fixed-length buffer is being tested for being well-formed
+UTF-8, but the final few bytes in it don't comprise a full character; that is,
+it is split somewhere in the middle of the final code point's UTF-8
+representation. (Presumably when the buffer is refreshed with the next chunk
+of data, the new first bytes will complete the partial code point.) This
+function is used to verify that the final bytes in the current buffer are in
+fact the legal beginning of some code point, so that if they aren't, the
+failure can be signalled without having to wait for the next read.
 
 =cut
 */
-- 
1.8.3.1
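
The return-value rules in the reworded apidoc can be illustrated with a small standalone sketch. This is not Perl's implementation: `partial_char_sketch` and `expected_len` are hypothetical names, and the sketch assumes strict Unicode UTF-8 (no overlongs, no surrogates, nothing above U+10FFFF) rather than the Perl-extended UTF-8 the real function accepts. It returns 1 only when the bytes in [s, e) are a proper, non-empty prefix of one well-formed character, and 0 when they are already a full character (or more) or can never become one.

```c
#include <assert.h>
#include <stddef.h>

/* Total length implied by a lead byte under strict UTF-8; 0 if the byte
 * cannot start a character. (Hypothetical helper, not part of Perl.) */
static size_t expected_len(unsigned char b)
{
    if (b < 0x80) return 1;
    if (b >= 0xC2 && b <= 0xDF) return 2;
    if (b >= 0xE0 && b <= 0xEF) return 3;
    if (b >= 0xF0 && b <= 0xF4) return 4;
    return 0;
}

/* Sketch of the documented behaviour for strict UTF-8 only:
 * 1 if [s, e) is a partial character that more bytes could complete,
 * 0 if it is empty, already complete (or more), or malformed. */
static int partial_char_sketch(const unsigned char *s, const unsigned char *e)
{
    size_t have, need, i;

    assert(s != NULL && e != NULL);
    if (s >= e) return 0;                 /* nothing to examine */

    have = (size_t)(e - s);
    need = expected_len(s[0]);
    if (need == 0 || have >= need) return 0;

    for (i = 1; i < have; i++) {
        unsigned char lo = 0x80, hi = 0xBF;
        if (i == 1) {
            /* The second byte has a narrower range after some lead bytes. */
            if (s[0] == 0xE0) lo = 0xA0;        /* would be overlong */
            else if (s[0] == 0xED) hi = 0x9F;   /* would be a surrogate */
            else if (s[0] == 0xF0) lo = 0x90;   /* would be overlong */
            else if (s[0] == 0xF4) hi = 0x8F;   /* would exceed U+10FFFF */
        }
        if (s[i] < lo || s[i] > hi) return 0;
    }
    return 1;
}
```

In a chunked reader, a check like this would be applied to the trailing bytes of a buffer: a result of 1 means "keep these bytes and wait for the next chunk", while 0 on an incomplete tail means the input can be rejected immediately.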