This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
Nicolas R [Mon, 29 Aug 2022 21:24:32 +0000 (21:24 +0000)]
Promote v5.36 usage and feature bundles doc
Promote the use of 'v5.36' instead of 'v5.10'.
Also point to the existing Cheat Sheet for the
feature bundle.
For consistency, also removed the final dot in several
'head2' feature titles.
Elvin Aslanov [Tue, 30 Aug 2022 07:05:41 +0000 (11:05 +0400)]
[doc] Update File::Basename synopsis
Add `my` on examples.
Elvin Aslanov [Tue, 30 Aug 2022 07:16:02 +0000 (11:16 +0400)]
Update File::Spec synopsis
Add `my` to examples for better practice.
Elvin Aslanov [Tue, 30 Aug 2022 07:32:25 +0000 (11:32 +0400)]
[doc] Update FileHandle synopsis
Add `my` to examples for better practice.
Elvin Aslanov [Tue, 30 Aug 2022 07:53:57 +0000 (11:53 +0400)]
[doc] Update File::Copy synopsis
Add `my` to examples for better practice.
Elvin Aslanov [Tue, 30 Aug 2022 08:13:25 +0000 (12:13 +0400)]
[doc] Update IO::Handle synopsis
Add `my` to examples for better practice.
Yves Orton [Mon, 29 Aug 2022 11:57:00 +0000 (13:57 +0200)]
perlinterp.pod - minor enhancements of the docs about JMPENV_ macros
Added a bit more on the levels and return codes. 1 is actually used
in the outermost call, or at least there is code to support it. Also
added "level" info, since that is used in -Dl.
Elvin Aslanov [Tue, 30 Aug 2022 08:09:02 +0000 (12:09 +0400)]
Update IO::File synopsis
Add `my` to examples for better practice.
Elvin Aslanov [Tue, 30 Aug 2022 08:11:05 +0000 (12:11 +0400)]
Update IO::Dir synopsis
Add `my` to examples for better practice.
Yves Orton [Sat, 27 Aug 2022 11:57:05 +0000 (13:57 +0200)]
deb.c - when PL_copline is NOLINE show 0 not 2^32-1.
NOLINE is defined to be U32_MAX (not sure why it is not 0,
as lines in files by convention are numbered from 1 so 0 should
be an illegal line number), some parts of our code explicitly
set the cop_line to be NOLINE. When we debug we should show this
as line 0, the U32_MAX value is somewhat confusing. We could also
just not show a line at all, but for now this makes more sense,
especially as the deb.c logic was using 0 when there is no curcop.
Without this patch:
$ ./perl -Dl -e'eval "1+;"x10'
(-e:0) ENTER scope 2 (savestack=38) at op.c:10897
(-e:4294967295) LEAVE scope 2 (savestack=46) at op.c:10937
(-e:4294967295) savestack: releasing items 46 -> 38
(-e:0) ENTER scope 2 (savestack=40) at perly.c:289
(-e:1) ENTER scope 3 (savestack=78) at toke.c:4868
With this patch:
$ ./perl -Dl -e'eval "1+;"x10'
(-e:0) ENTER scope 2 (savestack=38) at op.c:10897
(-e:0) LEAVE scope 2 (savestack=46) at op.c:10937
(-e:0) savestack: releasing items 46 -> 38
(-e:0) ENTER scope 2 (savestack=40) at perly.c:289
(-e:1) ENTER scope 3 (savestack=78) at toke.c:4868
This fixes GH Issue #20175
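The display rule described above can be sketched in a few lines of C. This is only an illustration, not the deb.c code: `SKETCH_NOLINE`, `display_line()` and `format_prefix()` are hypothetical names, and the only fact taken from the commit message is that the "no line" sentinel is U32_MAX and should be shown as 0.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for perl's NOLINE sentinel (U32_MAX per the text). */
#define SKETCH_NOLINE UINT32_MAX

/* Map a stored line number to the value shown in debug output:
 * the "no line" sentinel is displayed as 0, everything else as-is. */
static unsigned long display_line(uint32_t line)
{
    return line == SKETCH_NOLINE ? 0UL : (unsigned long)line;
}

/* Format a "(file:line)" debug prefix into buf. */
static int format_prefix(char *buf, size_t size, const char *file, uint32_t line)
{
    return snprintf(buf, size, "(%s:%lu)", file, display_line(line));
}
```

Calling `format_prefix()` with the sentinel yields a prefix of the form `(-e:0)`, matching the "With this patch" output.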
Karl Williamson [Thu, 18 Aug 2022 16:14:30 +0000 (10:14 -0600)]
diag.t: subDivide exception list
Some of the entries in this list of diagnostics are ones that people
just haven't gotten around to documenting.
Patches welcome!
But there are, in my opinion, a very few that don't ever need
to be documented because the text of the diagnostic is sufficiently
explanatory in and of itself.
This commit revises the comments explaining the exception list, and
moves the ones I think are sufficiently self-explanatory to the front.
Karl Williamson [Sat, 27 Aug 2022 20:40:56 +0000 (14:40 -0600)]
Use LIFO in Destroying locale subsystem
I noticed this in code reading. I don't know if it is currently a
problem, but at destruction time, the last created should be the first
destroyed, as nothing should be depending on it (since everything else
was created before it), but it and its destruction could be depending on
things created earlier.
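A minimal sketch of last-created, first-destroyed teardown, assuming a small fixed registry. The names (`create_object`, `destroy_all`, the `created` and `destroyed` arrays) are made up for illustration and are unrelated to the actual locale subsystem code.

```c
#include <stddef.h>

#define MAX_OBJECTS 8

/* Hypothetical registry: object ids are recorded in creation order. */
static int created[MAX_OBJECTS];
static size_t n_created = 0;

/* Order in which objects were torn down (kept only for illustration). */
static int destroyed[MAX_OBJECTS];
static size_t n_destroyed = 0;

static int create_object(int id)
{
    if (n_created == MAX_OBJECTS)
        return -1;
    created[n_created++] = id;
    return 0;
}

/* Destroy in reverse creation order: the newest object goes first, so
 * anything it depends on (created earlier) is still alive at that point. */
static void destroy_all(void)
{
    n_destroyed = 0;
    while (n_created > 0)
        destroyed[n_destroyed++] = created[--n_created];
}
```

Creating objects 1, 2, 3 and then calling `destroy_all()` tears them down as 3, 2, 1.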
Dagfinn Ilmari Mannsåker [Sun, 28 Aug 2022 13:09:49 +0000 (14:09 +0100)]
Fix POD formatting error in perl5372delta
Yves Orton [Sat, 27 Aug 2022 06:49:58 +0000 (08:49 +0200)]
retainedlines.t - deterministic results and fixup tests under failure
The existing logic is only "correct" when all the tests pass. When they
fail the tests are revealed to be non-deterministic. The other problem is
that once a single test case fails and leaks an entry in the stash the
following tests are all contaminated and trigger "false failures". The
combination makes things look much more broken than they are.
This patch fixes the tests so that they are deterministic and the
"leaking" between test cases is stopped.
See GH Issue #20174
Kenneth Olwing [Thu, 25 Aug 2022 16:48:26 +0000 (18:48 +0200)]
Change optimization level for Win32 builds
This fixes #20136.
Building on Windows 11 with the Strawberry 5.32.1 (gcc 8.3.0) toolchain,
multiple errors in the tests are seen. Worse, when building on Windows 10
no test errors crop up, but the resulting perl will still crash and die
when the tests are run manually on Windows 11.
After changing the optimization level to -Os as found in #20024, the build
and the tests now succeed.
Kenneth Olwing [Sat, 27 Aug 2022 19:02:52 +0000 (21:02 +0200)]
Custom timeout in watchdog in -T tests fails
This fixes #20173
The recent change in the test.pl/watchdog() in #20134 causes
re/substT.t .......................................................... ok
and
perf/taint.t ......................................................... ok
This occurs when the PERL_TEST_TIME_OUT_FACTOR environment variable is set in the build,
which causes $timeout_factor to become tainted.
Untainting it fixes the problem.
Yves Orton [Fri, 26 Aug 2022 05:44:41 +0000 (07:44 +0200)]
gv.c - use SVf_QUOTEDPREFIX in error message
I overlooked this case when SVf_QUOTEDPREFIX was introduced
Yves Orton [Fri, 26 Aug 2022 07:23:46 +0000 (09:23 +0200)]
perl.h - move defines out of incorrect ifdef
Not sure how that happened in the original commit, but the
PERL_RAND_SEED related defines should not be conditional on anything
related to doubles.
James E Keenan [Fri, 26 Aug 2022 16:49:02 +0000 (16:49 +0000)]
Change variable names in test
To avoid "subroutine redefined" warning.
For: https://github.com/Perl/perl5/issues/20164
Elvin Aslanov [Thu, 25 Aug 2022 08:47:43 +0000 (12:47 +0400)]
[doc] Update File::stat synopsis
Added `my` to Synopsis for better practice.
Repository points to GH:
https://metacpan.org/pod/File::stat
Karl Williamson [Tue, 26 Jul 2022 12:56:05 +0000 (06:56 -0600)]
POSIX: mbrlen,mbtowc: Use SvPVbyte not SvPV
It's better form to explicitly use the 'byte' form. Just above, we
tested that the byte form is valid.
Richard Leach [Thu, 25 Aug 2022 15:07:26 +0000 (15:07 +0000)]
Perldelta for GH#20077 (addition to existing list)
Richard Leach [Sun, 7 Aug 2022 23:34:42 +0000 (23:34 +0000)]
Add OPpTARGET_MY optimization to OP_UNDEF
This allows the existing `undef` OP to act on a pad SV. The following
two cases are optimized:
`undef my $x`, currently implemented as:
4 <1> undef vK/1 ->5
3 <0> padsv[$x:1,2] sRM/LVINTRO ->4
`my $x = undef`, currently implemented as:
5 <2> sassign vKS/2 ->6
3 <0> undef s ->4
4 <0> padsv[$x:1,2] sRM*/LVINTRO ->5
These are now just represented as:
3 <1> undef[$x:1,2] vK/SOMEFLAGS ->4
Note: The two cases are not quite functionally identical, as `$x = undef`
clears the SV flags but preserves any PV allocation for later reuse,
whereas `undef $x` does free any PV allocation. This behaviour difference
is preserved through use of the OPpUNDEF_KEEP_PV flag.
Karl Williamson [Thu, 25 Aug 2022 02:47:55 +0000 (20:47 -0600)]
run/locale.t: skip illegal locale test when all are legal
OpenBSD believes that locales are so inherently insecure that it only
internally has two locales: C and C.UTF-8, which it knows how to treat
securely.
However, to preserve at least the illusion of portability, the locale
setting operations do not fail whatever garbage is passed as the locale
name. (There may be unacceptable strings, but any reasonable locale
name and a lot of unreasonable ones are accepted.) If the input name
looks like it was meant to have "UTF-8" in its name, the locale is set
to C.UTF-8. Otherwise, it is set to C. Early releases had bugs in
switching between the two.
I wrote a Configure probe some years ago to detect this situation.
I forgot about this fact in adapting Bram's test case. It should be
skipped on such platforms.
This fixes #20149
Yves Orton [Mon, 1 Aug 2022 14:21:12 +0000 (16:21 +0200)]
sv.c - add a _QUOTEDPREFIX version of SVf, UTF8f, and HEKf for use in error messages.
These new formats are intended to be used in error messages where
we want to show the contents of a string without any possible
hidden characters not rendering in the error message, and where
it would be unreasonable to show every character of the string
if it is very long.
A good example would be when we want to say that a class name is
illegal. Consider:
"Foo\0"->thing()
should not throw an error message about "Foo" being missing; the fact
that there is a null in there should be visible to the developer.
Similarly if we had
("x" x 1000_000)->thing()
we also do not want to throw a 1MB error message, as it is generally
just unhelpful; a class name that long is almost certainly a mistake.
Currently this patch restricts it to showing 256 characters: the first
128, followed by an ellipsis, followed by the last 128 characters. The
docs are such that we can change that if we wish; I suspect something
like 100 would be more reasonable. You can override the define
PERL_QUOTEDPREFIX_LEN to a longer value in Configure if you wish.
Example usage:
other= newSVpvs("Some\0::Thing\n");
sv_catpvf(msg_sv,"%" SVf_QUOTEDPREFIX, SVfARG(other));
Should append
"Some\0::Thing\n"
to the msg_sv. If it were very long it would have an ellipsis infixed. The
class name "x" x 1_000_000 would show
Can't locate object method "non_existent_method" via \
package "x[repeated 128 times]"..."x[repeated 128 times]" \
(perhaps you forgot to load \
"x[repeated 128 times]"..."x[repeated 128 times]"?) at -e line 1.
(but obviously as one line with the literal text of the class instead of
"[repeated 128 times]")
This patch changes a variety of error messages that used to output the
full string always. I haven't changed every place that this could happen
yet, just the main ones related to method calls, subroutine names and
the like.
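The quoting-and-truncating idea can be sketched in standalone C. This is not the sv.c implementation: `quote_prefix()` and its helpers are hypothetical, the escape set (NUL, newline, quote, backslash, other non-printables as `\x{..}`) is an assumption of the sketch, and `keep` plays the role of the 128-character halves described above.

```c
#include <stdio.h>
#include <string.h>

/* Append `piece` to `out` if it fits (including the trailing NUL);
 * pieces that do not fit are dropped. Returns the new used length. */
static size_t append_str(char *out, size_t used, size_t out_size,
                         const char *piece)
{
    size_t n = strlen(piece);
    if (used + n + 1 <= out_size) {
        memcpy(out + used, piece, n + 1);
        used += n;
    }
    return used;
}

/* Append s[0..len) with hidden characters made visible. */
static size_t append_escaped(char *out, size_t used, size_t out_size,
                             const char *s, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        char tmp[8];
        if (c == '\0')
            snprintf(tmp, sizeof tmp, "\\0");
        else if (c == '\n')
            snprintf(tmp, sizeof tmp, "\\n");
        else if (c == '"' || c == '\\')
            snprintf(tmp, sizeof tmp, "\\%c", c);
        else if (c >= 0x20 && c < 0x7f)
            snprintf(tmp, sizeof tmp, "%c", c);
        else
            snprintf(tmp, sizeof tmp, "\\x{%02x}", c);
        used = append_str(out, used, out_size, tmp);
    }
    return used;
}

/* Write a double-quoted, escaped form of s[0..len) into out. Strings longer
 * than 2*keep bytes show only the first and last `keep` bytes, with an
 * ellipsis between two quoted halves. The caller supplies a large enough
 * buffer. */
static void quote_prefix(char *out, size_t out_size,
                         const char *s, size_t len, size_t keep)
{
    size_t used = 0;
    if (out_size == 0)
        return;
    out[0] = '\0';
    used = append_str(out, used, out_size, "\"");
    if (len <= 2 * keep) {
        used = append_escaped(out, used, out_size, s, len);
    } else {
        used = append_escaped(out, used, out_size, s, keep);
        used = append_str(out, used, out_size, "\"...\"");
        used = append_escaped(out, used, out_size, s + len - keep, keep);
    }
    append_str(out, used, out_size, "\"");
}
```

The length is passed explicitly because the input may contain embedded NULs, which is exactly the case the commit wants to make visible.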
Kenneth Olwing [Tue, 16 Aug 2022 17:41:45 +0000 (19:41 +0200)]
Handle intrin files on win32 with gcc
This fixes #20033.
When building on Windows with Strawberry 5.32.1 (gcc 8.3.0) as the toolchain,
the Errno.pm is created by a script Errno_pm.pl, which takes output from the
compiler to find headers.
A subset of these headers may only be included by certain specific
headers. Previously the header order was effectively random and this
occasionally caused build errors (which, moreover, were never detected).
get_files() now returns the header names in the order the compiler
saw them, which ensures they are in the right order.
Paul "LeoNerd" Evans [Wed, 24 Aug 2022 15:25:34 +0000 (16:25 +0100)]
Add PUSHpvs("literal") macro family
This set of PUSH-style macros takes a string literal argument and pushes
it to the stack, optionally mortalizing it and/or extending the stack.
Previously, the best alternative was
mPUSHp("literal", 7);
which required the author to visually count the number of characters in
the string literal (7 in this case). These new macros follow the same
pattern as many existing helpers such as `newSVpvs`, which takes a
string literal and computes its length directly.
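The trick of letting the compiler count a string literal can be sketched outside of perl as follows. `LITERAL_LEN`, `PUSH_LITERAL`, `push_str()` and the `stack` array are hypothetical names for this illustration; they are not the macros added by the commit and do not touch the perl stack.

```c
#include <stddef.h>

/* Hypothetical macro: length of a string literal, computed at compile time.
 * The "" s "" concatenation makes the compiler reject non-literal arguments. */
#define LITERAL_LEN(s) (sizeof("" s "") - 1)

/* Hypothetical stack of (pointer, length) string slots. */
struct str_slot { const char *ptr; size_t len; };
static struct str_slot stack[16];
static size_t stack_top = 0;

static void push_str(const char *ptr, size_t len)
{
    if (stack_top < sizeof stack / sizeof stack[0]) {
        stack[stack_top].ptr = ptr;
        stack[stack_top].len = len;
        stack_top++;
    }
}

/* Push a literal without counting its characters by hand. */
#define PUSH_LITERAL(s) push_str("" s "", LITERAL_LEN(s))
```

With this, `PUSH_LITERAL("literal")` records a length of 7 without the author ever writing the number.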
Karl Williamson [Wed, 24 Aug 2022 17:09:47 +0000 (11:09 -0600)]
makedef.pl: Rmv obsolete symbol
Spotted by Bram
Karl Williamson [Wed, 24 Aug 2022 16:25:33 +0000 (10:25 -0600)]
perldelta for changes to I18N::Langinfo
Karl Williamson [Thu, 4 Aug 2022 01:04:43 +0000 (19:04 -0600)]
perlapi: Add cautions about PL_Sv, PL_na
Karl Williamson [Mon, 22 Aug 2022 19:16:33 +0000 (13:16 -0600)]
Consolidate PERL_TEST_TIME_OUT_FACTOR to watchdog()
This changes test.pl watchdog() to always consider this potential
setting of an environment variable, and removes the distributed uses.
This means that no code needs to change when tests start failing on a
slow platform due to timing out; or when you need to temporarily
increase the timeout of a test for debugging. This will correspondingly
increase the timeout of all tests, but who cares for debugging purposes.
Bram [Tue, 23 Aug 2022 22:03:22 +0000 (00:03 +0200)]
updateAUTHORS.pm: quote the commit range
Several of the tests in t/porting/update_authors.t include
the '^' symbol in the commit range to refer to the parent of the
commit (or '^^^' to refer to the great-grandparent).
On Windows '^' is a meta-character in cmd.exe which causes the
next character to be escaped (i.e. it is the Windows variant of '\').
=> When running commands the commit range must be quoted (since
it might contain '^' characters). (An alternative fix: do
`s/\^/^^/g;` when running on Windows)
Bram [Tue, 23 Aug 2022 21:27:27 +0000 (23:27 +0200)]
TMP: Skip author tests when there is no git config
and/or when there are only untracked files.
The old t/porting/pending-authors.t test:
- ignored untracked files (i.e. when there were only untracked
files then the test was skipped)
- did not run when there was no git config (that is no name/email)
This behavior was not included when migrating to Porting/updateAUTHORS.p[lm]
which - currently - results in several smokers being unhappy :(
[Test::Smoke creates files in the build dir + several smokers do not have
a git configuration.]
For now: restore the old behavior until a better fix is in place.
(Note: in the test I didn't use `skip()` since that would've meant
adding a `SKIP` block and indenting the code some more which I did
not want to do for a change that's going to be removed in the future.)
(List of failed smokers: https://perl5.test-smoke.org/submatrix?test=../t/porting/authors.t&pversion=5.37.4 )
Karl Williamson [Sat, 20 Aug 2022 17:13:39 +0000 (11:13 -0600)]
Use Windows system default locale
This fixes #20054
In order to make Perl programs more portable, the perl locale system on
Windows emulates POSIX behavior. But Windows has a default system
locale, missing from other platforms. This really should be considered
as a fallback when the POSIX behavior doesn't work.
And it was, for a few releases, as noted in the ticket. But it was
inadvertently broken in 5.28; this commit now restores it.
Todd Rinaldo [Mon, 22 Aug 2022 19:35:08 +0000 (19:35 +0000)]
Bump Locale::Maketext version to 1.32 to match CPAN
This corrected a Makefile.PL bug, specific to the dual-life version,
related to where libraries were installed.
See #20087 for more information.
H.Merijn Brand [Tue, 23 Aug 2022 11:14:32 +0000 (13:14 +0200)]
Update Config::Perl::V to version 0.34
Richard Leach [Mon, 22 Aug 2022 13:47:46 +0000 (13:47 +0000)]
rpeep: don't apply padsv_store and padrange together
As originally committed, the OP_PADSV_STORE optimization interacted
negatively with OP_PADRANGE:
1. The new rpeep code was buggy, as it assumed that oldop must be the
targ PADSV, when it could have been a padrange. In the first case,
updating `oldoldop->op_next = o` is correct, in the second case the
op_next chain must be left as-is. That was easily fixable.
2. There was some problem with stack book-keeping - probably of the
mark stack. The following test case continued to fail even after
the rpeep code had been fixed:
my $x = {}; my $y; print keys %{$y = $x};
However, since both OP_PADSV_STORE and OP_PADRANGE optimize by taking
the targ PADSV out of the op_next chain, it was apparent that there is
reduced gain from having both optimizations applied to the same optree.
Therefore, the simple fix applied by this commit is to modify peep(),
such that the OP_PADSV_STORE optimization is not applied when OP_PADRANGE
has already been applied.
Existing tests did not pick up the problems, which were identified via
Blead-Breaks-CPAN reports. Additional tests have thus been included.
Bram [Thu, 11 Aug 2022 17:02:16 +0000 (19:02 +0200)]
win32: Remove trailing backslash from `INST_TOP`
When `INST_TOP` contains a trailing backslash then things go horribly wrong.
Example (output slightly altered for readability):
C:\...> gmake INST_TOP=C:\Perl\blead\perl\
...
..\miniperl.exe -I..\lib config_sh.PL
"INST_TOP=C:\Perl\blead\perl\"
"INST_VER="
"INST_ARCH="
"archname=MSWin32-x64-multi-thread"
"cc=gcc"
"ld=g++"
"ccflags= -DWIN32 -DWIN64 ...."
...
Use of uninitialized value $opt{"static_ext"} in split at config_sh.PL line 57.
...
Can't open -DWIN32: No such file or directory at config_sh.PL line 335.
...
..\miniperl.exe -I..\lib ..\configpm --chdir=..
Use of uninitialized value $t in string eq at ..\configpm line 345.
...
written lib/Config.pod
syntax error at lib/Config_heavy.pl line 165, near "x;"
Compilation failed in require at ..\configpm line 1144.
gmake: *** [GNUmakefile:1195: ..\lib\Config.pm] Error 255
-> The trailing backslash in 'INST_TOP' caused the double quote (in
`miniperl.exe config_sh.PL`) to be escaped which messes up the
rest of the arguments/the argument parsing leading to the errors.
Avoid the errors by removing the trailing backslash.
(Tested on Windows 10 with GNU Make v4.2.1)
Bram [Sat, 20 Aug 2022 13:40:37 +0000 (15:40 +0200)]
Skip t/porting/authors.t on shallow clones
On shallow clones the history is not available.
Before:
The 'are we on a branch' logic was broken which caused this
test to assume it was always on a branch which caused it to
use: `git log HEAD^1..HEAD^2 | perl Porting/checkAUTHORS.pl --tap`.
That `git log` command returns *no commits* when HEAD is *not* a branch
and then the test passes.
In a shallow clone (clone without history) there is only one commit and
never a merge so this test silently did nothing on a shallow clone.
Intermediate:
In commit
16dd3f70cc16005d5af7146385733a7c945fb67e the 'are we on a branch'
logic was fixed and now it does do the right thing.
But this introduces a new problem: on a shallow clone there is no history
so there never is a merge commit. But GitHub Actions does *add* a merge
commit (in some cases) with a different author/committer name/email address
(one which should not be in AUTHORS).
Result: the GitHub CI *failed* because they're using shallow clones.
Now:
When a shallow clone is detected just skip the entire test.
Previously it already did that (by accident); now this is made explicit.
Example git log:
my branch for which a PR was created:
$ git log --format=format:"%h %aN <%aE> - %s" --graph -n25
* 12a6316 Bram <perl-rt@wizbit.be> - TMP: Add some temporary debugging
* 8ac37a5 Bram <perl-rt@wizbit.be> - CI: Special case porting test in 'sanity check'
* 7f36725 Karl Williamson <khw@cpan.org> - makedef: Export certain symbols
GitHub CI with full git history (i.e. no shallow clone):
$ git log --format=format:"%h %aN <%aE> - %s" --graph -n25
* 99884eedeb Bram <109858694+bram-perl@users.noreply.github.com> - Merge 12a6316de1a8834f07d76c59480bafc3fdfa0c66 into 7f367253e335e8507638bb2ca1767c0fedbc95d3
|\
| * 12a6316de1 Bram <perl-rt@wizbit.be> - TMP: Add some temporary debugging
| * 8ac37a5a03 Bram <perl-rt@wizbit.be> - CI: Special case porting test in 'sanity check'
|/
* 7f367253e3 Karl Williamson <khw@cpan.org> - makedef: Export certain symbols
GitHub CI without history (i.e. shallow clone):
$ git log --format=format:"%h %aN <%aE> - %s" --graph -n25
* 99884eede Bram <109858694+bram-perl@users.noreply.github.com> - Merge 12a6316de1a8834f07d76c59480bafc3fdfa0c66 into 7f367253e335e8507638bb2ca1767c0fedbc95d3
For future reference: the git show info:
commit 99884eedebb6cccd1596023a94e32eb3aea6f5e8
Author: Bram <109858694+bram-perl@users.noreply.github.com>
Commit: GitHub <noreply@github.com>
(I do *not* have that Author email-address configured)
Bram [Sat, 20 Aug 2022 14:56:50 +0000 (16:56 +0200)]
CI: Remove fetching of tags from sanity check
For the 'sanity check' CI the option `fetch-depth: 0` is used.
This causes it to fetch all branches *and* all tags with all history.
By doing `git fetch --depth=1 origin +refs/tags/*:refs/tags/*` it is
fetching all the tags again and in this case *discarding* all the history
of the tags and turning this into a shallow clone instead of a full clone.
-> Remove the fetching of tags from the sanity check CI since they were
already fetched with their full history.
(This is needed because in the next commit a check will be done to see
if the test is running in a shallow clone)
Karl Williamson [Mon, 22 Aug 2022 12:53:26 +0000 (06:53 -0600)]
Merge branch 'fourth batch of locale commits' into blead
This batch of commits deals largely with Perl_langinfo(), which is a
wrapper for libc nl_langinfo(). That function is missing from some
platforms, notably Windows. This commit series extends our emulation to
better cover the components previously missing. The CODESET item
detection is much improved on such systems. This allows us to reliably
determine if any given item is UTF-8 or not. New API functions are
created to allow XS code to directly access the values, instead of
having to go through POSIX.
Redundant implementations of mbtowc, mbrtowc, strftime, and localeconv
are collapsed into one, eliminating some bugs found in one but not the
other.
Karl Williamson [Tue, 9 Aug 2022 23:08:51 +0000 (17:08 -0600)]
Revert "XXX Temporarily skip on Windows"
This should now be fixed by intervening commits
Karl Williamson [Sun, 14 Aug 2022 17:13:58 +0000 (11:13 -0600)]
langinfo.t: Use mnemonic; Avoid $_
It is clearer and safer to use a mnemonic variable across several
statements.
Karl Williamson [Mon, 15 Aug 2022 13:10:09 +0000 (07:10 -0600)]
Langinfo.t: White-space comments only
Karl Williamson [Mon, 15 Aug 2022 13:00:34 +0000 (07:00 -0600)]
langinfo.t: Cope with bad system locale returns
On our current CI Windows box, the Albanian UTF-8 locale, at least, is
returning illegal-UTF-8; it appears to be 8859-2 instead. Prior to this
commit, the test would fail. This commit does some revamping to better
handle and not fail when there is a system bug with some isolated
locale, but to instead continue trying with other locales. And to give
better diagnostics as to what actually happens.
Karl Williamson [Mon, 15 Aug 2022 12:56:27 +0000 (06:56 -0600)]
Langinfo.t: Use different loc_tools function to simplify
find_utf8_ctype_locales() allows this to get rid of some kludgy code
Karl Williamson [Sun, 14 Aug 2022 21:27:38 +0000 (15:27 -0600)]
locale.c: Add fallbacks if no mbtowc()
This adds heuristics that work well for non-English locales to determine
if a locale is UTF-8 or not when mbtowc() isn't available. It would be
a very rare compiler that didn't have that these days, but this covers
that case as best as I have been able to figure out.
Karl Williamson [Thu, 4 Aug 2022 21:48:13 +0000 (15:48 -0600)]
run/locale.t: Use langinfo not localeconv
Karl Williamson [Thu, 4 Aug 2022 21:34:59 +0000 (15:34 -0600)]
run/locale.t white space
Karl Williamson [Thu, 4 Aug 2022 21:23:46 +0000 (15:23 -0600)]
locale_threads.t: Use I18N::Langinfo, not POSIX::localeconv()
The former is always present; the latter might not be
Karl Williamson [Tue, 9 Aug 2022 17:59:06 +0000 (11:59 -0600)]
lib/locale.t: Use I18N::Langinfo, not POSIX::localeconv()
Now that Langinfo is ported to every box, it requires less work than
localeconv(), and offers more choices. This commit changes to use it,
and for more info when debugging, gets some additional info from it,
while avoiding some calls when not debugging
Karl Williamson [Fri, 19 Feb 2021 00:26:55 +0000 (17:26 -0700)]
Add Perl_langinfo8()
This is like Perl_langinfo() but additionally returns information about
the UTF-8ness of the returned string.
Karl Williamson [Tue, 2 Mar 2021 01:07:19 +0000 (18:07 -0700)]
locale.c: Add utf8ness return param to my_langinfo_i()
my_langinfo_i() now will additionally return the UTF-8ness of the
returned string.
Karl Williamson [Thu, 18 Feb 2021 23:08:19 +0000 (16:08 -0700)]
Add my_strftime8()
This is like plain my_strftime(), but additionally returns an indication
of the UTF-8ness of the returned string
Karl Williamson [Thu, 28 Jul 2022 22:46:18 +0000 (16:46 -0600)]
locale.c: Add branch prediction, comments
Karl Williamson [Thu, 18 Feb 2021 16:12:37 +0000 (09:12 -0700)]
locale.c: Collapse duplicate logic into one instance
A previous commit moved the logic for localeconv() into locale.c. This
commit takes advantage of that to use it instead of repeating the logic.
Notably, this commit removes the inconsistent duplicate logic that had
been used to deal with the Windows broken localeconv() bug.
Karl Williamson [Sat, 10 Apr 2021 16:02:27 +0000 (10:02 -0600)]
locale.c: localeconv() unconditional NUMERIC toggle
It is possible to lock out changing the LC_NUMERIC locale. This is done
in some printf cases where a recursive call could get the radix
character wrong. But localeconv(), which could be called during this
recursion on some platforms, toggles the locale briefly, without
affecting the surrounding calls; so it can do the toggle
unconditionally.
The previous commit merely moved the functionality of localeconv() from
POSIX.xs to locale.c. This commit expands upon that.
Karl Williamson [Thu, 18 Feb 2021 13:27:28 +0000 (06:27 -0700)]
Move POSIX::localeconv() logic to locale.c
The code currently in POSIX.xs is moved to locale.c, and reworked some
to fit in that scheme, and the logic for the workaround for the Windows
broken localeconv() is made more robust.
This is in preparation for the next commit which will use this logic
instead of (imperfectly) duplicating it.
This also creates Perl_localeconv() for direct XS calls of this
functionality.
Karl Williamson [Thu, 18 Feb 2021 03:18:33 +0000 (20:18 -0700)]
locale.c: Add fcn for UTF8ness determination
get_locale_string_utf8ness_i() will determine if the string it is passed
in the locale it is passed is to be treated as UTF-8, or not.
Karl Williamson [Sun, 21 Aug 2022 15:59:01 +0000 (09:59 -0600)]
Add internal typedef locale_utf8ness_t
This will be used in future commits
Karl Williamson [Sun, 21 Aug 2022 15:57:34 +0000 (09:57 -0600)]
Add utf8ness_t typedef
This will be used in future commits
Karl Williamson [Thu, 18 Feb 2021 00:24:33 +0000 (17:24 -0700)]
locale.c: Add is_locale_utf8()
Previous commits have added the infrastructure to be able to determine
if a locale is UTF-8. This will prove useful, and this commit adds
a function to encapsulate this information, and uses it in a couple of
places, with more to come in future commits.
This uses as a final fallback, mbtowc(), supposed to be available in
C99. Future commits will add heuristics when that function isn't
available or is known to be unreliable on a particular system.
Karl Williamson [Mon, 8 Aug 2022 03:12:45 +0000 (21:12 -0600)]
locale.c: Need CTYPE to match other category for nl_langinfo
nl_langinfo knows about various components of locales that are supposed
to be defined for every locale, such as a string for a Yes/No response
or the name of a month in a particular language. These are associated
with various locale categories. In the examples cited, the month names
are in the LC_TIME category, and the responses in the LC_MESSAGES one.
But (perhaps because these are text strings), some platforms require the
LC_CTYPE locale to be the same as the other locale. cygwin is an
example. Rather than try to figure out which platforms require this, and
which do not, it is a simple matter to just set LC_CTYPE at the same time
as the other category.
Karl Williamson [Mon, 8 Aug 2022 03:29:02 +0000 (21:29 -0600)]
New signature for static fcn my_langinfo()
This commit changes the calling sequence for my_langinfo to add the
desired locale, and the locale category of the desired item.
This allows the function to be able to return the desired value for any
locale, avoiding some locale changes that would happen until this
commit, and hiding the need for locale changes from outside functions,
though a couple continue to do so to avoid potential multiple changes.
Karl Williamson [Mon, 8 Aug 2022 03:27:18 +0000 (21:27 -0600)]
Add toggle_locale() fcns
These are designed to temporarily switch the locale for a category
around some operation that needs it to be different than the current
one. They will be used in the next commit.
These will eventually replace the more unwieldy
_is_cur_LC_category_utf8() function, which toggles as a side effect
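The toggle-and-restore pattern can be sketched with plain setlocale(). The signatures below are hypothetical and chosen for this illustration only; the commit message does not give the real ones.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Switch `category` to `wanted` if it differs from the current locale.
 * Returns a malloc'd copy of the previous locale name to hand to
 * restore_locale(), or NULL if no switch was made (already in the wanted
 * locale, or the switch failed). */
static char *toggle_locale(int category, const char *wanted)
{
    const char *current = setlocale(category, NULL);
    char *saved;

    if (current == NULL || strcmp(current, wanted) == 0)
        return NULL;

    /* Copy first: a later setlocale() call may overwrite the returned string. */
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return NULL;
    strcpy(saved, current);

    if (setlocale(category, wanted) == NULL) {
        free(saved);
        return NULL;
    }
    return saved;
}

/* Undo a toggle_locale(); a NULL `saved` means there is nothing to undo. */
static void restore_locale(int category, char *saved)
{
    if (saved != NULL) {
        setlocale(category, saved);
        free(saved);
    }
}
```

A caller wraps the operation that needs the other locale between the two calls; when the category is already in the wanted locale, both calls do nothing.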
Karl Williamson [Wed, 17 Feb 2021 16:56:06 +0000 (09:56 -0700)]
locale.c: Improve non-nl_langinfo() CODESET calc
Prior to this commit, on non-Windows platforms that don't have a
nl_langinfo() libc function, the code completely punted computation of
the CODESET item. I have not been able to figure out how to do this,
even going to the locale definition files on disk (which may vary
anyway), but we can do a lot better than punting.
This commit adds three checks:
1) If the locale name is C or POSIX, we know the codeset
2) We can detect if a locale is UTF-8. If it is, that is the codeset.
Many modern locales are of this ilk.
3) Failing that, some locales have the codeset appear in the name,
following a dot.
It isn't perfect, but it's a lot better than completely punting.
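The three checks above can be sketched as one small function. Everything here is illustrative: `guess_codeset()` is a hypothetical name, the codeset string returned for C/POSIX is an assumption of the sketch (the commit message does not say which name is used), and stopping at an '@' modifier is likewise an assumption.

```c
#include <string.h>

/* Guess a CODESET name for `locale_name`, following the three checks.
 * `is_utf8` is the result of a separate UTF-8 detection step.
 * Writes into `out` and returns it; the result is "" if nothing applies. */
static const char *guess_codeset(const char *locale_name, int is_utf8,
                                 char *out, size_t out_size)
{
    const char *dot;
    size_t n;

    if (out_size == 0)
        return "";
    out[0] = '\0';

    /* 1) C and POSIX: the name used here is an assumption of this sketch. */
    if (strcmp(locale_name, "C") == 0 || strcmp(locale_name, "POSIX") == 0) {
        strncat(out, "ANSI_X3.4-1968", out_size - 1);
        return out;
    }

    /* 2) A locale detected as UTF-8 has that as its codeset. */
    if (is_utf8) {
        strncat(out, "UTF-8", out_size - 1);
        return out;
    }

    /* 3) Otherwise take the text after the dot, up to any '@' modifier. */
    dot = strchr(locale_name, '.');
    if (dot != NULL) {
        n = strcspn(dot + 1, "@");
        if (n > out_size - 1)
            n = out_size - 1;
        memcpy(out, dot + 1, n);
        out[n] = '\0';
    }
    return out;
}
```

A name with no dot and no UTF-8 detection yields an empty string, which mirrors the "it isn't perfect" caveat.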
Karl Williamson [Wed, 17 Feb 2021 14:15:48 +0000 (07:15 -0700)]
locale.c: Add static fcn to analyze locale name codeset
It determines if the name indicates it is UTF-8 or not. There are
several variant spellings in use, and this hides that from the
callers.
It won't be actually used until the next commit
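A name-only check of this kind might look like the sketch below. `name_indicates_utf8()` is a hypothetical function; the set of spellings it accepts (any letter case, with or without the hyphen, only in the part after the dot) is an assumption and may differ from what locale.c recognizes.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the codeset part of `locale_name` (the text after the dot)
 * spells UTF-8 in one of a few common ways: "UTF-8", "utf8", "UTF8", ... */
static int name_indicates_utf8(const char *locale_name)
{
    static const char want[] = "utf";
    const char *p = strchr(locale_name, '.');
    size_t i;

    if (p == NULL)
        return 0;
    p++;
    for (i = 0; i < 3; i++, p++) {
        if (tolower((unsigned char)*p) != want[i])
            return 0;
    }
    if (*p == '-')                 /* optional hyphen */
        p++;
    if (*p != '8')
        return 0;
    p++;
    return *p == '\0' || *p == '@';
}
```

Callers see a single yes/no answer and never have to know which spelling a platform uses.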
Karl Williamson [Wed, 17 Feb 2021 13:56:18 +0000 (06:56 -0700)]
locale.c: langinfo: Use Windows fcn to find CODESET
There is a Windows function, available for quite a long time, that will
return the current code page. Use this for the nl_langinfo() CODESET,
as that libc function isn't implemented on Windows.
Karl Williamson [Fri, 12 Aug 2022 20:37:25 +0000 (14:37 -0600)]
locale.c: Make S_save_to_buffer() reentrant
This makes my_langinfo() reentrant by adding parameters specifying where
to store the result.
This prepares for future commits, and fixes some minor bugs for XS
writers, in that the claim was that the buffer in calling
Perl_langinfo() was safe from getting zapped until the next call to it
in the same thread. It turns out there were cases where, because of
internal calls, the buffer did get zapped.
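The idea of making the save routine reentrant by letting the caller say where the result goes can be sketched as follows. `sketch_save_to_buffer()` is a hypothetical stand-in, not the real S_save_to_buffer() signature.

```c
#include <stdlib.h>
#include <string.h>

/* Copy `string` into *buf, growing it as needed. The caller owns both the
 * buffer and its recorded size, so nested users that pass different buffers
 * cannot overwrite each other's results.
 * Returns *buf, or NULL on allocation failure (the old buffer is kept). */
static const char *sketch_save_to_buffer(const char *string,
                                         char **buf, size_t *buf_size)
{
    size_t needed = strlen(string) + 1;

    if (*buf == NULL || *buf_size < needed) {
        char *grown = realloc(*buf, needed);
        if (grown == NULL)
            return NULL;
        *buf = grown;
        *buf_size = needed;
    }
    memcpy(*buf, string, needed);
    return *buf;
}
```

An internal call that uses its own scratch buffer leaves the buffer handed back to an outer caller untouched, which is the kind of zapping the commit describes.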
Karl Williamson [Mon, 1 Aug 2022 15:53:45 +0000 (09:53 -0600)]
embed.fnc: Also check for NL_LANGINFO_L
The preprocessor directives were only looking for plain nl_langinfo().
It's quite unlikely that a platform will have the '_l' version without
also having the plain one. But this makes sure.
Karl Williamson [Mon, 1 Aug 2022 12:45:47 +0000 (06:45 -0600)]
locale: make PL_langinfo_buf const *
The previous commit allows this change to be made.
Karl Williamson [Wed, 17 Feb 2021 16:36:15 +0000 (09:36 -0700)]
locale.c: Use a scratch buf; instead of reusing old
This is in preparation for the next commit
Karl Williamson [Sun, 31 Jul 2022 14:56:49 +0000 (08:56 -0600)]
locale.c: Fix Windows bug with broken localeconv()
localeconv() was broken on Windows until VS 2015. As a workaround, this
was using my_snprintf() to find what the decimal point character is,
trying to avoid our workaround for localeconv(), which has a (slight)
chance of a race condition.
The problem is that my_snprintf() might not end up calling snprintf at
all; I didn't trace all possibilities in Windows. So it doesn't make
for a reliable sentinel.
This commit now specifically uses libc snprintf(), and if it fails, drops
down to try localeconv().
It also changes things so that if localeconv() is not present at all, or
is not usable on the platform, this snprintf method is used.
Karl Williamson [Thu, 28 Jul 2022 21:19:06 +0000 (15:19 -0600)]
locale.c: Don't read off buffer end
In some configurations, given exactly the wrong input, it would have been
possible to read past the buffer end. This commit adds a conditional to
prevent that.
Karl Williamson [Tue, 16 Feb 2021 16:22:24 +0000 (09:22 -0700)]
locale.c: S_save_to_buffer; Rmv no longer used param
Previous commits have gotten rid of this parameter.
Karl Williamson [Mon, 29 Mar 2021 01:33:27 +0000 (19:33 -0600)]
S_save_to_buffer() allow ignoring return value
Future commits will want to use this, while discarding the return value.
Karl Williamson [Tue, 16 Feb 2021 17:05:35 +0000 (10:05 -0700)]
locale.c: Don't ask S_save_to_buffer() to be inlined
It's too complicated to really be inlined, and the compiler can figure
things out itself given it is a static function
Karl Williamson [Tue, 16 Feb 2021 16:43:38 +0000 (09:43 -0700)]
locale.c: Don't add CP to Windows code page names
The actual name appears to be just the number for purposes of
nl_langinfo()-ish things.
Karl Williamson [Tue, 16 Feb 2021 16:01:58 +0000 (09:01 -0700)]
locale.c: Fix currency symbol derivation
On platforms without nl_langinfo(), we derive the currency symbol from
localeconv(). The symbol must be tweaked to conform to nl_langinfo()
standards. Prior to this commit, it guessed at how to tweak a rare
circumstance. I found evidence this guess was wrong, so looked around,
and copied the way cygwin does it.
This also no longer returns just an empty string in certain cases.
nl_langinfo() itself doesn't, so conform to that.
Karl Williamson [Tue, 16 Feb 2021 12:54:14 +0000 (05:54 -0700)]
locale.c: Rmv redundant cBOOL()
strEQ and && already return booleans
Karl Williamson [Tue, 16 Feb 2021 11:52:10 +0000 (04:52 -0700)]
locale.c: Use typedef to simplify
This allows some preprocessor conditionals to be removed
Karl Williamson [Tue, 16 Feb 2021 11:36:24 +0000 (04:36 -0700)]
locale.c: Extend S_save_to_buffer()
This will allow it to be used in situations where the buffer it controls
is single use, and we don't need to keep track of the size for future
calls.
Karl Williamson [Tue, 16 Feb 2021 11:31:11 +0000 (04:31 -0700)]
locale.c: Shorten my_nl_langinfo() to my_langinfo()
The extra syllable(s) are unnecessary noise
Karl Williamson [Wed, 27 Jul 2022 19:16:17 +0000 (13:16 -0600)]
locale.c: White-space only
Align with previous commit and properly indent some preprocessor
directives
Karl Williamson [Mon, 1 Mar 2021 13:11:40 +0000 (06:11 -0700)]
locale.c: Rmv reimplementation of my_strftime()
Prior to this commit, there was a near duplicate copy of the code from
util.c that implements my_strftime(). This was done because the util.c
version zaps the wday field, which made it incompatible.
But it dawned on me that if the arbitrary date we use to do our
calculations were such that it was for a year in which the wday field
gets zapped to the value we want it to be, then the util.c version
automatically works. This happens in years when January 1 falls on a
Sunday.
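The observation about years that start on a Sunday can be checked with a small
standalone sketch. It uses standard mktime() rather than the util.c code, and
both the helper name wday_of() and the choice of 2017 as an example year are
assumptions made for illustration; the commit does not say which date it uses.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: day of the week (0 = Sunday) for a calendar date,
 * obtained by letting mktime() recompute tm_wday.  'mon' is 0-based, as in
 * struct tm.  Returns -1 if mktime() cannot represent the date. */
static int wday_of(int year, int mon, int mday)
{
    struct tm tm = {0};

    assert(mday >= 1 && mday <= 31);

    tm.tm_year  = year - 1900;
    tm.tm_mon   = mon;
    tm.tm_mday  = mday;
    tm.tm_hour  = 12;   /* noon keeps clear of DST transitions */
    tm.tm_isdst = -1;   /* let the library decide about DST */

    if (mktime(&tm) == (time_t) -1)
        return -1;

    return tm.tm_wday;  /* overwritten by mktime() from the date fields */
}
```

In a year where January 1 is a Sunday, such as 2017, January 1 + n has weekday
n for n in 0..6, so the recomputed wday field matches the value one would have
set by hand.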
Karl Williamson [Mon, 1 Mar 2021 13:05:26 +0000 (06:05 -0700)]
locale.c: Return defaults for uncomputable langinfo items
Return the values from the C locale for nl_langinfo() items that aren't
computable on this platform. If the platform has nl_langinfo(), then
all of them are computable, but if not, some can't be computed, and
others can be, but only if there are alternative methods available on
the platform.
As part of this commit, S_my_nl_langinfo() and S_save_to_buffer() are no
longer used when USE_LOCALE is not defined, so don't compile them.
Karl Williamson [Tue, 16 Feb 2021 03:04:30 +0000 (20:04 -0700)]
locale.c: Add two #defines
This makes sure that we handle having any variant of nl_langinfo() or
localeconv().
Karl Williamson [Tue, 16 Feb 2021 02:46:57 +0000 (19:46 -0700)]
locale.c: Make statics of repeated string constants
These strings are (or soon will be) used in multiple places; so have
just one definition for them.
Karl Williamson [Tue, 16 Feb 2021 01:55:22 +0000 (18:55 -0700)]
locale.c: Reorder cases in a switch
This moves handling the CODESET to the end, as future commits will make
its handling more complicated. The cases are now ordered so the
simplest (based on the direction of future commits) are first
Karl Williamson [Sat, 13 Aug 2022 11:35:25 +0000 (05:35 -0600)]
locale.c: Mollify clang
It claims these could be used uninitialized. I don't think they can, but
change to quiet the warning.
Karl Williamson [Tue, 16 Feb 2021 04:13:35 +0000 (21:13 -0700)]
Move code for mbr?towc() from POSIX.xs to locale.c
This avoids duplicated logic.
Karl Williamson [Sun, 14 Feb 2021 02:07:40 +0000 (19:07 -0700)]
locale.c: Separate out two Win fcns from a larger one
This makes the larger one easier to understand, and prepares for
possible independent calls to the two, which are potentially useful on
their own.
Karl Williamson [Wed, 17 Feb 2021 03:40:50 +0000 (20:40 -0700)]
locale.c: Don't change locale if already there
Changing the locale is cheap for some categories, but expensive for
others. Changing LC_COLLATE is most expensive, requiring recalculation
of the collation transformation mapping.
This commit checks that we aren't already in the desired locale before
changing locales, and does nothing if no change is needed.
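The check-before-change idea can be sketched with plain setlocale(). This is
not the locale.c implementation, and set_locale_if_needed() is a hypothetical
name; it compares locale names as strings, so it assumes the caller passes a
name in the same form that setlocale() reports.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: switch 'category' to 'wanted' only if it is not
 * already the current locale.  Returns 0 if nothing had to change, 1 if the
 * locale was changed, and -1 if setlocale() rejected the request. */
static int set_locale_if_needed(int category, const char *wanted)
{
    const char *current;

    assert(wanted != NULL);

    /* A NULL second argument queries the locale without changing it. */
    current = setlocale(category, NULL);
    if (current != NULL && strcmp(current, wanted) == 0)
        return 0;   /* already there: skip the potentially expensive switch */

    return setlocale(category, wanted) != NULL ? 1 : -1;
}
```

For a category like LC_COLLATE, the early return is where the saving comes
from, since no recalculation is triggered when the name already matches.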
Karl Williamson [Tue, 26 Jul 2022 21:51:22 +0000 (15:51 -0600)]
locale.c: Silence some compiler warnings if no LC_ALL
Karl Williamson [Sun, 7 Aug 2022 13:20:43 +0000 (07:20 -0600)]
Initialize PL_numeric_name, PL_collation_name
Having these initialized to the C locale avoids some otherwise required
conditionals.
Karl Williamson [Wed, 10 Aug 2022 21:36:52 +0000 (15:36 -0600)]
locale.c: Refactor a strerror implementation
The previous commit made it clear to me that this implementation of
strerror() could be simplified. (There are several implementations
depending on what libc functions are available on the platform.)
Karl Williamson [Fri, 5 Aug 2022 12:51:19 +0000 (06:51 -0600)]
locale.c: querylocale return mortalized copy
It is too easy to forget to savepv() the return of these macros, leading
to hard-to-diagnose bugs. Head those off at the pass by always making a
copy that gets freed by the system.
James E Keenan [Sun, 21 Aug 2022 17:58:13 +0000 (17:58 +0000)]
Correct .mailmap entry for
a08a101a45
Bram [Fri, 12 Aug 2022 21:26:39 +0000 (23:26 +0200)]
t/io/eintr.t: only show diag message on failure
Also improve the error message by capturing the exception/error
returned by the `fcntl` call.
Bram [Fri, 12 Aug 2022 13:40:19 +0000 (15:40 +0200)]
t/io/eintr.t: Add `diag` when F_GETPIPE_SZ failed
When the fcntl call to get the size of the pipe buffer failed,
it would silently fall back to 0xfffff.
-> Make this fallback no longer silent and add a diagnostic message
when it happens.
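The test itself is written in Perl; purely as an illustration of the
"fall back, but say so" behaviour, here is a C sketch of the same idea.
F_GETPIPE_SZ is Linux-specific, the helper name pipe_buffer_size() is
hypothetical, and the diagnostic wording is invented for the example. Only the
0xfffff fallback value comes from the commit message.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes F_GETPIPE_SZ on glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

#define PIPE_SIZE_FALLBACK 0xfffffL

/* Hypothetical helper: return the pipe buffer size for 'fd'.  If it cannot
 * be determined, print a diagnostic to stderr and return the fallback
 * instead of failing silently. */
static long pipe_buffer_size(int fd)
{
#ifdef F_GETPIPE_SZ
    int size = fcntl(fd, F_GETPIPE_SZ);

    if (size > 0)
        return size;

    fprintf(stderr, "# F_GETPIPE_SZ failed: %s; assuming %#lx\n",
            strerror(errno), (unsigned long) PIPE_SIZE_FALLBACK);
#else
    (void) fd;
    fprintf(stderr, "# F_GETPIPE_SZ not available; assuming %#lx\n",
            (unsigned long) PIPE_SIZE_FALLBACK);
#endif
    return PIPE_SIZE_FALLBACK;
}
```

Reporting the errno text alongside the fallback makes it possible to tell an
unsupported platform from a genuine failure when reading the test output.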