-The C<-T> and C<-B> switches work as follows. The first block or so of the
-file is examined for odd characters such as strange control codes or
-characters with the high bit set. If too many strange characters (>30%)
-are found, it's a C<-B> file; otherwise it's a C<-T> file. Also, any file
-containing a zero byte in the first block is considered a binary file. If C<-T>
-or C<-B> is used on a filehandle, the current IO buffer is examined
+The C<-T> and C<-B> switches work as follows. The first block or so of
+the file is examined to see if it is valid UTF-8 that includes non-ASCII
+characters. If so, it's a C<-T> file. Otherwise, that same portion of
+the file is examined for odd characters such as strange control codes or
+characters with the high bit set. If more than a third of the
+characters are strange, it's a C<-B> file; otherwise it's a C<-T> file.
+Also, any file containing a zero byte in the examined portion is
+considered a binary file. (If executed within the scope of a L<S<use
+locale>|perllocale> which includes C<LC_CTYPE>, odd characters are
+anything that is neither printable nor a space in the current locale.) If
+C<-T> or C<-B> is used on a filehandle, the current IO buffer is
+examined