This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
Karl Williamson [Tue, 9 Jan 2024 19:21:41 +0000 (12:21 -0700)]
locale.c: Indent a couple statements
This will allow them to vertically align with code added in a future
commit
Karl Williamson [Wed, 10 Jan 2024 16:24:47 +0000 (09:24 -0700)]
locale.c: Refactor use of Perl_form
This commit creates #define's for the format to use in that function
call. This will make a future commit easier.
Karl Williamson [Wed, 10 Jan 2024 04:03:32 +0000 (21:03 -0700)]
locale.c: Move some langinfo emulation code
Some paths through this section of code don't require the result to be
saved to a buffer. Move the code so as to avoid that unnecessary work.
The result is indented an extra amount to prepare for a future commit.
Karl Williamson [Tue, 9 Jan 2024 14:12:08 +0000 (07:12 -0700)]
S_my_langinfo_i: Always return both a buffer and a value
This function returns a value, and also copies that value into a buffer.
Certain rare Configurations failed to fill in the buffer in some
circumstances.
This commit fixes bugs on those Configurations. It is also the first of
several commits that avoid unnecessary copies; those later commits will
make this one obsolete.
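The "return a value and also fill the buffer" contract described above can be sketched roughly as follows. This is a minimal illustration, not the real S_my_langinfo_i(): the name lookup_and_save() and its parameters are hypothetical, and the growable buffer is only an assumption about how such a helper might look.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in, not the real S_my_langinfo_i(): copy 'value' into
 * the caller's growable buffer on every path, then return the buffer. */
static const char *
lookup_and_save(const char *value, char **buf, size_t *buf_size)
{
    assert(value && buf && buf_size);

    size_t needed = strlen(value) + 1;      /* include the trailing NUL */
    if (*buf_size < needed) {
        char *grown = realloc(*buf, needed);
        if (!grown)
            return NULL;                    /* allocation failed */
        *buf = grown;
        *buf_size = needed;
    }

    memcpy(*buf, value, needed);
    return *buf;    /* returned value and buffer contents always agree */
}
```

A caller starts with a NULL buffer of size 0, passes both by address, and frees the buffer once it is done with it.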
Дилян Палаузов [Sat, 13 Jan 2024 11:35:51 +0000 (12:35 +0100)]
t/re/charset: spell the locale when setlocale() fails
Closes https://github.com/Perl/perl5/issues/21805
Karl Williamson [Sat, 13 Jan 2024 20:47:12 +0000 (13:47 -0700)]
dilyanpalauzov is now a Perl author
rwp0 [Thu, 11 Jan 2024 01:08:32 +0000 (02:08 +0100)]
[doc] Minor spelling issues in C file comments
Correct duplicate words and convert links
to HTTPS where supported
Max Maischein [Fri, 12 Jan 2024 09:22:14 +0000 (10:22 +0100)]
Update Exporter version number to 5.78
This was released to CPAN with changes from the blead Makefile.PL
( via
04528e741d17821b6c5e65bfb24935e912d182d3 )
under a new version number but the version number bump was not
done in bleadperl.
Found via
./perl Porting/core-cpan-diff -d Exporter
after editing the version number in Porting/Maintainers.pl to 5.78.
Applied via
./perl Porting/core-cpan-diff -d Exporter -r | patch -p0
Max Maischein [Fri, 12 Jan 2024 09:15:11 +0000 (10:15 +0100)]
Correct Locale::Maketext version number in Maintainers.pl to 1.33
... also update the Changes file to the 1.33 version. No other differences
found via
./perl Porting/core-cpan-diff -d Locale::Maketext
Diff applied via
./perl Porting/core-cpan-diff -d Locale::Maketext -r | patch -p0
Max Maischein [Fri, 12 Jan 2024 09:08:09 +0000 (10:08 +0100)]
Update IO version number in Porting/Maintainers.pl
The code is 1.55 but the version number in Maintainers.pl was at 1.51
Checking that the stuff is identical was done with
./perl Porting/core-cpan-diff -d IO
Max Maischein [Thu, 11 Jan 2024 19:24:37 +0000 (20:24 +0100)]
cpan/Math-BigInt-FastCalc - Update to version 0.5018
0.5018 2024-01-06
* Sync test files with Math-BigInt.
Max Maischein [Thu, 11 Jan 2024 19:23:34 +0000 (20:23 +0100)]
cpan/Math-BigInt - Update to version 2.003002
2.003002 2024-01-05
* Improved interaction between Math::BigInt and the backend libraries.
* Much faster versions of _ilog2() and _clog2() implemented in the
Math::BigInt::Calc backend library. This should speed up bilog2() and
bclog2() in Math::BigInt when "Calc" is used as the backend library.
Max Maischein [Tue, 9 Jan 2024 19:34:44 +0000 (20:34 +0100)]
cpan/Pod-Checker - Update to version 1.76
Version 1.76
+ CPAN#149267: "OK" message should go to STDOUT, not STDERR
+ CPAN#150660: podchecker -quiet does not print "OK" message
21-May-2022 Marek Rouchal <marekr@cpan.org>
Karl Williamson [Wed, 10 Jan 2024 13:45:49 +0000 (06:45 -0700)]
locale.c: Change S_emulate_langinfo formal arg type
The type of this parameter changed recently from an enum to an int,
and embed.fnc was updated, but not the actual function definition.
No compilers except the OpenBSD one complained so far.
Fixes GH #21811
Dagfinn Ilmari Mannsåker [Wed, 10 Jan 2024 17:58:58 +0000 (17:58 +0000)]
Remove full stops from POD headings
Headings are not sentences and should not have a trailing full stop
unless the last word is an abbreviation or is part of an ellipsis.
Karl Williamson [Wed, 6 Dec 2023 13:44:36 +0000 (06:44 -0700)]
locale.c: Use macro to avoid conditionals
We know at compile time what these conditionals always evaluate to. Use
a macro to replace the conditional with nothing when it's always going to
be true. (The section of code doesn't even get compiled when it's going
to always be false.)
Karl Williamson [Tue, 9 Jan 2024 19:10:17 +0000 (12:10 -0700)]
Fix blead not compiling with -DNO_LOCALE
This recently introduced failure was due to yet another problem with my
doing a 'rebase -i'. embed.fnc nowadays gets re-sorted as part of the
build process, and I need to manually verify each time that the new
ordering didn't screw anything up. Anyway, this function declaration
got placed in the wrong bucket of #ifdefs in embed.fnc. This commit
fixes that.
Tony Cook [Wed, 10 Jan 2024 22:41:19 +0000 (09:41 +1100)]
Tony Cook [Wed, 10 Jan 2024 00:48:27 +0000 (11:48 +1100)]
XSUB.h: use Stack_off_t for AX and items
I hadn't expected code to be taking pointers or references to AX,
which turned out to be wrong, so make them Stack_off_t.
This allows XS::Framework or similar code to build with a default
build of perl, but it will still fail to build if perl is built
with -DPERL_STACK_OFFSET_SSIZET, which can only be fixed by updating
XS::Framework to use Stack_off_t itself.
Fixes #21782
Karl Williamson [Sun, 31 Dec 2023 14:41:06 +0000 (07:41 -0700)]
perlapi: Rmv obsolete advice regarding Perl_langinfo
The header files have long been adjusted so that the programmer doesn't
have to concern themselves with this bookkeeping.
Karl Williamson [Sat, 30 Dec 2023 15:09:58 +0000 (08:09 -0700)]
locale.c: Use macros to avoid some #ifdef's
This #defines these macros once depending on the Configuration to be
either no-ops, or the appropriate expansion. Then in the code just the
macros are used instead of #ifdef's.
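A rough sketch of the technique, under stated assumptions: DEMO_HAS_CATEGORY and CATEGORY_SETUP() are made-up names standing in for a Configure-derived symbol and one of the macros in question. The macro is defined once, and the call site needs no #ifdef.

```c
#include <assert.h>

/* Hypothetical feature switch; in perl this would come from Configure. */
#ifdef DEMO_HAS_CATEGORY
#  define CATEGORY_SETUP(counter)  ((counter)++)     /* real expansion */
#else
#  define CATEGORY_SETUP(counter)  ((void) 0)        /* no-op */
#endif

static int
run_steps(void)
{
    int steps = 0;

    steps++;
    CATEGORY_SETUP(steps);  /* no #ifdef needed at the call site */
    steps++;

    return steps;
}
```

Built without -DDEMO_HAS_CATEGORY the macro call compiles away entirely; built with it, the extra step runs.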
Max Maischein [Tue, 9 Jan 2024 18:06:00 +0000 (19:06 +0100)]
Renée and Max swapped their release dates/versions
David Mitchell [Tue, 9 Jan 2024 13:10:10 +0000 (13:10 +0000)]
[MERGE] handle/optimise some ops in void context
This branch:
- Fixes a couple of ops which weren't getting their context set correctly.
  This doesn't make any change to the correct operation of those ops (they
  don't change their behaviour based on context), but could allow for
  future optimisations.
- Skips a couple of portmanteau OP optimisations ({padsv,aelemfastlex}_store)
  in non-void context. This allows those two ops to be slightly faster in
  void context (which is overwhelmingly how they are likely to be used(*)),
  and falls back to the old separate padsv/sassign op pair otherwise.
- Updates a few other ops to handle void context slightly more efficiently.
(*) Tony Cook profiled the test suite: 83.7% of calls to
pp_padsv_store()s were in void context and 99.9% of
aelemfastlex_store()s.
David Mitchell [Thu, 21 Dec 2023 19:22:45 +0000 (19:22 +0000)]
pp_push, pp_unshift: don't push TARG in void cxt
David Mitchell [Sat, 16 Dec 2023 18:53:41 +0000 (18:53 +0000)]
pp_undef(): optimise for void return
This function is typically called in void context; don't push a result
in void context which will only be immediately popped again by the
following nextstate or unstack op. On PERL_RC_STACK builds, this
also avoids ++ing and --ing the return value's ref count.
In unknown context it just assumes non-void, rather than incurring the
cost of determining the caller's context.
David Mitchell [Sat, 16 Dec 2023 18:20:50 +0000 (18:20 +0000)]
pp_print(): optimise for void return
This function is often called in void context; don't push a result
in void context which will only be immediately popped again by the
following nextstate or unstack op. On PERL_RC_STACK builds, this
also avoids ++ing and --ing the return value's ref count.
In unknown context it just assumes non-void, rather than incurring the
cost of determining the caller's context.
David Mitchell [Fri, 15 Dec 2023 23:32:08 +0000 (23:32 +0000)]
pp_sassign: make return more efficient.
Currently scalar assign effectively always pops the original two
arguments, then pushes the result.
This commit makes it so that
1) in void context (the most usual), it just pops 2 and skips pushing 1.
2) In scalar context, it still logically pops 2 and pushes 1, but since
the return value is also the original left arg, just shuffle that
argument around on the stack rather than messing around with its
reference count in the meantime. The right arg is still freed.
To make it so that *PL_stack_sp remains equal to left, a couple of
non-hot codepaths which break this rule now update the stack to keep
things consistent.
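The two cases can be modelled with a toy reference-counted stack. This is only a sketch: DemoSV, DemoStack and demo_sassign() are invented for illustration, and the layout (right arg below left arg at the top of the stack) is an assumption of the model rather than a statement about every code path in pp_sassign().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for an SV and the argument stack. */
typedef struct { unsigned refcnt; int value; } DemoSV;
typedef struct { DemoSV *slot[8]; size_t depth; } DemoStack;

static void
demo_sassign(DemoStack *st, int void_context)
{
    assert(st && st->depth >= 2);

    DemoSV *left  = st->slot[st->depth - 1];   /* top of stack */
    DemoSV *right = st->slot[st->depth - 2];

    left->value = right->value;                /* the assignment itself */

    if (void_context) {
        /* 1) pop both args and push nothing */
        left->refcnt--;
        right->refcnt--;
        st->depth -= 2;
    }
    else {
        /* 2) the result is the left arg: move its existing stack
         *    reference down one slot instead of --ing and ++ing it */
        st->slot[st->depth - 2] = left;
        right->refcnt--;                       /* right arg is still freed */
        st->depth -= 1;
    }
}
```

In the scalar branch the left argument's reference count is never touched, which is the saving the commit message describes.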
David Mitchell [Fri, 15 Dec 2023 22:21:12 +0000 (22:21 +0000)]
OP_AELEMFASTLEX_STORE: only in void context
For the optimisation which converts $lex[N] = expr into an
OP_AELEMFASTLEX_STORE op, only optimise if the scalar assign is in VOID
context.
(See the previous commit for a very similar change to OP_PADSV_STORE.)
This allows us to stop pp_aelemfastlex_store() from uselessly pushing the
result onto the stack, only to be immediately popped again by the
following nextstate or unstack op. This becomes more important on
PERL_RC_STACK builds, as each push or pop involves manipulating the SV's
reference count.
I'm working on the assumption that scalar/list/unknown cxt lexical
array element assigns are rare enough that not optimising them is less
of a loss than optimising the void case. So:
    $lex[0] = ....;               # void:    now even faster
    $other = $lex[0] = ...;       # scalar:  now slow again
    foo($lex[0] = ..., ....);     # list:    now slow again
    sub {
        ....
        $lex[0] = ...;            # unknown: now slow again
    }
David Mitchell [Fri, 15 Dec 2023 21:55:04 +0000 (21:55 +0000)]
OP_PADSV_STORE: only in void context
For the optimisation which converts $lex = expr into an OP_PADSV_STORE
op, only optimise if the scalar assign is in VOID context.
This allows us to stop pp_padsv_store() from uselessly pushing the
result onto the stack, only to be immediately popped again by the
following nextstate or unstack op. This becomes more important on
PERL_RC_STACK builds, as each push or pop involves manipulating the SV's
reference count.
I'm working on the assumption that scalar/list/unknown cxt lexical
assigns are rare enough that not optimising them is less of a loss than
optimising the void case. So:
    $lex = ....;                  # void:    now even faster
    $other = $lex = ...;          # scalar:  now slow again
    foo($lex = ..., ....);        # list:    now slow again
    sub {
        ....
        $lex = ...;               # unknown: now slow again
    }
David Mitchell [Mon, 18 Dec 2023 21:50:24 +0000 (21:50 +0000)]
Give OPpTARGET_MY ops real context
Perl has an optimisation whereby an op which returns the RHS of a scalar
assignment to a lexical, and which would normally store its result in
a PADTMP, instead skips the following PADSV and SASSIGN ops and assigns
to the lexical directly, instead of to the PADTMP. For example in
$lex = $a + $b;
the ops (in execution-order) would be changed from
add[t5] sK/2
padsv[$lex:1,2] sRM*
sassign vKS/2
nextstate(main 2 -e:1) v:{
to
add[$lex:1,2] sK/TARGMY,2
nextstate(main 2 -e:1) v:{
However, note that although the add op is now essentially called in
void context, it is still marked as being in scalar context. This commit
changes it to be marked as void, i.e.
add[$lex:1,2] vK/TARGMY,2
The main reason for this is to allow for future optimisations in
functions like pp_add(), which will be able to skip pushing the result
onto the stack in void context. It just so happens that scalar
assignments to lexical vars are typically in void context.
However, since this is a visible change from the perspective of modules
which examine ops or optrees, I'll leave doing any optimisations until
later, in case this commit needs to be reverted.
The main things this commit had to fix up following the change were:
- still call overload methods in scalar context, even though the op is
now marked as void;
- not issuing "useless use of add in void context" style warnings on
such ops;
- making pp_push and pp_unshift still set TARG in void context
No matter whether the op's context is marked as scalar or as void, some
parts of perl will misunderstand, and will need to be special-cased
(since the op is really both, depending on your perspective). So this
commit changes the burden of the special-casing code to be in the
non-hot code paths, like during compilation or when calling out to an
overload method.
David Mitchell [Sat, 16 Dec 2023 14:23:27 +0000 (14:23 +0000)]
set context in 'state' expressions
The ops associated with a state variable declaration and initial
assignment weren't getting their context (GIMME) assigned.
Background:
state $x = ...;
gets compiled to something similar to
    if (first_time)
        $x = ...;
    else
        $x;
Except that the 'if' is performed by an OP_ONCE logop, which checks and
updates a flag in the pad, and branches to op_next or op_other as
appropriate.
During compilation, the context of the state expression wasn't being
passed on to the children of the OP_ONCE. So the assignment (optimised
into a padsv_store) and the padsv were left as UNKNOWN rather than VOID
or SCALAR as appropriate, as in these two examples:
state $x = 1; # should be void
$y = (state $x = 1); # should be scalar
This commit fixes that. Note that at the moment it makes no practical
difference, since the padsv/padsv_store ops don't currently change
their behaviour based on context, but that might change.
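The run-once branching that OP_ONCE performs can be approximated in C as below. This is only a loose analogy: DemoStateVar, demo_init() and demo_state_fetch() are hypothetical names, and the model ignores context entirely, showing only the flag check.

```c
#include <assert.h>

typedef struct {
    int initialised;   /* plays the role of the flag OP_ONCE keeps in the pad */
    int value;
} DemoStateVar;

static int demo_init_calls = 0;

/* Stands in for the initialiser expression on the right-hand side. */
static int
demo_init(void)
{
    demo_init_calls++;
    return 1;
}

static int
demo_state_fetch(DemoStateVar *var)
{
    assert(var);

    if (!var->initialised) {      /* first time: run the assignment */
        var->initialised = 1;
        var->value = demo_init();
    }
    return var->value;            /* every later time: just fetch the value */
}
```

However often demo_state_fetch() is called on the same variable, demo_init() runs once.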
David Mitchell [Sat, 16 Dec 2023 13:51:09 +0000 (13:51 +0000)]
add Concise tests for state var assignment
Add tests for
state $x = 1;
my $y = state $x = 1;
to check what context is allocated to the various ops. At the moment it
is actually wrong in places, and this commit captures that wrongness.
The next commit will fix this, and those diffs to the tests added in this
commit will help make it clear what has changed.
Karl Williamson [Thu, 28 Dec 2023 19:09:57 +0000 (12:09 -0700)]
locale.c: Add clarifying comment
Karl Williamson [Thu, 28 Dec 2023 19:50:04 +0000 (12:50 -0700)]
locale.c: Can now call my_langinfo_i() in all circumstances
Prior to this commit, the caller had to know if the category in question
was being ignored in this Configuration. Now, that is transparent.
Karl Williamson [Thu, 28 Dec 2023 16:49:49 +0000 (09:49 -0700)]
locale.c: Extract code into a separate function
This new function does not have to be compiled in all Configurations,
and placing it separately removes some #ifdefs, and allows its absence
to be more transparent from its callers.
Karl Williamson [Tue, 2 Jan 2024 00:41:37 +0000 (17:41 -0700)]
locale.c: langinfo: Better handle unknown input item
On systems with <langinfo.h>, the 'item' parameter here must be a member
of an enum. If it isn't something we recognize, it's a real problem.
On other systems any int can be passed, and it could just be that the
caller got a wrong value. For these, we don't know if it is a perl
error or a user error, so just set EINVAL and return failure.
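For the second case (any int accepted), the behaviour might be sketched like this. The item numbers, the function name and the use of a NULL return to signal failure are all assumptions of this example; the commit message only says that EINVAL is set and failure is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical item numbers for the emulation. */
enum { DEMO_RADIX = 1, DEMO_CODESET = 2 };

static const char *
demo_emulated_langinfo(int item)
{
    switch (item) {
      case DEMO_RADIX:   return ".";
      case DEMO_CODESET: return "UTF-8";

      default:
        /* Could be a perl bug or a caller's mistake; we can't tell which,
         * so report it through errno rather than panicking. */
        errno = EINVAL;
        return NULL;
    }
}
```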
Karl Williamson [Thu, 28 Dec 2023 16:24:51 +0000 (09:24 -0700)]
locale.c: No strftime() implies empty formats
When returning an nl_langinfo() format when we are emulating
nl_langinfo() when there is no strftime() on the system, return an
empty format. A non-empty return is useless.
Karl Williamson [Thu, 28 Dec 2023 16:23:57 +0000 (09:23 -0700)]
locale.c: Consolidate handling of remaining TIME items
Without nl_langinfo(), some Configurations handled these nl_langinfo()
items associated with the LC_TIME category in Perl_langinfo8(), and some
Configurations in emulate_langinfo(). This commit consolidates all the
handling to emulate_langinfo().
Karl Williamson [Thu, 28 Dec 2023 16:16:59 +0000 (09:16 -0700)]
locale.c: Remove redundant handling of the AM_PM format
This format is identical whether or not LC_TIME is usable on this
platform. Previously, it was actually handled in two functions,
emulate_langinfo() when LC_TIME is usable, and Perl_langinfo8() when
not. Previous commits have enabled the format handling in
emulate_langinfo() to be returned instead of having separate cases,
depending on LC_TIME's availability. That means the handling of this
format in Perl_langinfo8() is redundant and this commit removes it.
Karl Williamson [Thu, 28 Dec 2023 15:36:52 +0000 (08:36 -0700)]
locale.c: Consolidate handling of ERA format langinfo items
Without nl_langinfo(), some Configurations handled these nl_langinfo()
items associated with the LC_TIME category in Perl_langinfo8(), and some
Configurations in emulate_langinfo(). This commit consolidates all the
handling to emulate_langinfo().
Karl Williamson [Thu, 28 Dec 2023 15:58:22 +0000 (08:58 -0700)]
locale.c: Consolidate handling of ALT_DIGITS langinfo item
Without nl_langinfo(), some Configurations handled this nl_langinfo()
item associated with the LC_TIME category in Perl_langinfo8(), and some
Configurations in emulate_langinfo(). This commit consolidates all the
handling to emulate_langinfo().
Karl Williamson [Thu, 28 Dec 2023 07:04:37 +0000 (00:04 -0700)]
locale.c: Move handling of ERA langinfo item
This commit moves the handling of the ERA langinfo item to
emulate_langinfo() away from Perl_langinfo8().
Karl Williamson [Thu, 28 Dec 2023 06:20:06 +0000 (23:20 -0700)]
locale.c: Consolidate handling of AM/PM langinfo items
Without nl_langinfo(), some Configurations handled these nl_langinfo()
items associated with the LC_TIME category in Perl_langinfo8(), and some
Configurations in emulate_langinfo(). This commit consolidates all the
handling to emulate_langinfo().
Karl Williamson [Thu, 28 Dec 2023 06:13:41 +0000 (23:13 -0700)]
locale.c: Consolidate handling of MON langinfo items
Without nl_langinfo(), some Configurations handled these nl_langinfo()
items associated with the LC_TIME category in Perl_langinfo8(), and some
Configurations in emulate_langinfo(). This commit consolidates all the
handling to emulate_langinfo().
Karl Williamson [Thu, 28 Dec 2023 04:45:52 +0000 (21:45 -0700)]
locale.c: Consolidate handling of ABMON langinfo items
Without nl_langinfo(), some Configurations handled these nl_langinfo()
items associated with the LC_TIME category in Perl_langinfo8(), and some
Configurations in emulate_langinfo(). This commit consolidates all the
handling to emulate_langinfo().
Karl Williamson [Thu, 28 Dec 2023 04:36:52 +0000 (21:36 -0700)]
locale.c: Consolidate handling of DAY langinfo items
Without nl_langinfo(), some Configurations handled these nl_langinfo()
items associated with the LC_TIME category in Perl_langinfo8(), and some
Configurations in emulate_langinfo(). This commit consolidates all the
handling to emulate_langinfo().
Karl Williamson [Tue, 2 Jan 2024 01:53:20 +0000 (18:53 -0700)]
locale.c: Consolidate handling of ABDAY langinfo items
Without nl_langinfo(), some Configurations handled these nl_langinfo()
items associated with the LC_TIME category in Perl_langinfo8(), and some
Configurations in emulate_langinfo(). This commit consolidates all the
handling to emulate_langinfo().
Karl Williamson [Thu, 28 Dec 2023 00:26:07 +0000 (17:26 -0700)]
locale.c: Move handling of LC_MESSAGES langinfo items
This commit moves the handling of langinfo items associated with the
LC_MESSAGES category to emulate_langinfo() when nl_langinfo() is not
available on the platform away from Perl_langinfo8().
This is a step in the process of handling all such items in
emulate_langinfo().
Karl Williamson [Wed, 27 Dec 2023 23:13:10 +0000 (16:13 -0700)]
locale.c: Consolidate handling of some langinfo items
Without nl_langinfo(), some Configurations handled nl_langinfo() items
associated with the LC_CTYPE, LC_MONETARY, and LC_NUMERIC categories in
Perl_langinfo8(), and some Configurations in emulate_langinfo(). This
commit consolidates all the handling to emulate_langinfo().
This commit removes the only use of a macro, which is hence also
removed.
Karl Williamson [Wed, 27 Dec 2023 22:33:06 +0000 (15:33 -0700)]
locale.c: Move check for unknown langinfo items
This moves this check into the lowest level function, so that no
duplication is needed.
Karl Williamson [Tue, 26 Dec 2023 14:13:55 +0000 (07:13 -0700)]
locale.c: Handle some ignored categories in langinfo emulation
It is possible to Configure perl to not allow any or all of the locale
categories on the platform to be changed away from the C locale,
generally due to libc limitations. OpenBSD, for example, keeps every
category but one in C; so perl shouldn't even try to change any of
those.
On platforms without the libc nl_langinfo() function, perl emulates it.
Prior to this commit, the emulation did not properly handle the case
of categories needing to stay in C. This did not lead to bugs, because
external to this file, any such calls were intercepted by
Perl_langinfo8(), and internally, we just didn't call it under those
circumstances.
But this leads to some awkwardness, and it is more maintainable to have
the handling in one place; which is this low-level emulation.
This commit causes the emulation to handle LC_CTYPE, LC_NUMERIC, and
LC_MONETARY needing to stay in the C locale. Future commits will add
the other categories, and remove the redundant checks in
Perl_langinfo8().
Karl Williamson [Mon, 8 Jan 2024 17:55:21 +0000 (10:55 -0700)]
locale.c: Effectively white-space, comments only
The previous few commits have added/removed code blocks and #if
directives. Adjust white space accordingly.
This also adds explanatory comments.
Karl Williamson [Mon, 8 Jan 2024 17:49:03 +0000 (10:49 -0700)]
locale.c: Compile S_emulate_langinfo() under more Configurations
This is in preparation for some code to be consolidated into just this
one function.
Karl Williamson [Mon, 1 Jan 2024 18:48:13 +0000 (11:48 -0700)]
locale.c: Always compile save_to_buffer()
The next commit will want to use it in more Configurations, and a future
commit in all Configurations.
Karl Williamson [Wed, 27 Dec 2023 22:10:42 +0000 (15:10 -0700)]
locale.c: Tighten circumstances some code is generated
This code need not be compiled except under the circumstances defined in
this commit.
Karl Williamson [Tue, 26 Dec 2023 13:09:49 +0000 (06:09 -0700)]
locale.c: Move GCC DIAG IGNORE
The previous commit changed two nested switch() statements. The inner
one turned off implicit switch() case statement fallthrough warnings.
This commit applies that to the entire outer switch().
Karl Williamson [Tue, 26 Dec 2023 13:08:51 +0000 (06:08 -0700)]
locale.c: Refactor two nested switch() statements
This slightly refactors these so that many case labels are no longer
repeated. It does this by making the nested one the default: of the
outer one, and the outer's previous default: becomes the inner's
default (which formerly was useless, so panicked).
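The shape of that refactoring, with made-up item names, is roughly the following. It is a sketch of the pattern only; the real switch in locale.c handles many more items and panics rather than returning NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical items; the first two belong to the outer switch. */
enum demo_item { DEMO_CODESET, DEMO_RADIX, DEMO_DAY_1, DEMO_MON_1 };

static const char *
demo_lookup(int item)
{
    switch (item) {
      case DEMO_CODESET: return "codeset";
      case DEMO_RADIX:   return "radix";

      default:
        /* The formerly separate inner switch now hangs off the outer
         * default:, so none of its case labels need repeating above. */
        switch (item) {
          case DEMO_DAY_1: return "day";
          case DEMO_MON_1: return "month";

          default:         /* the outer switch's old default: lives here */
            return NULL;
        }
    }
}
```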
Karl Williamson [Thu, 21 Dec 2023 17:07:15 +0000 (10:07 -0700)]
locale.c: Split a function in two
Prior to this commit, the function was #ifdef'd so that one or the other
part would compile. But future commits will cause them both to need to
be compiled in some circumstances, with the first part potentially
calling the second part.
Note that the 2nd part doesn't need the parameter 'cat_index' that the
first part does.
Karl Williamson [Thu, 21 Dec 2023 15:20:43 +0000 (08:20 -0700)]
locale.c: Recompute variable in 2nd part of function
This is in preparation for the next commit splitting the function in
two, and the current 2nd part will not need or have this variable passed
in as an argument.
Karl Williamson [Thu, 14 Dec 2023 17:48:35 +0000 (10:48 -0700)]
locale.c: Move some functions to earlier in the file
These are now more logically placed.
Karl Williamson [Mon, 4 Dec 2023 15:27:26 +0000 (08:27 -0700)]
locale.c: Remove PERL_UNUSED_ARG
'which_mask' is now always used
Karl Williamson [Wed, 3 Jan 2024 17:04:19 +0000 (10:04 -0700)]
perl.h: #include locale_table.h even if NO_LOCALE
It defines some symbols which help in this circumstance, and others
which would otherwise have to be #ifdef'd against in order to compile.
Karl Williamson [Wed, 3 Jan 2024 16:50:02 +0000 (09:50 -0700)]
-DNO_LOCALE implies -DNO_LOCALE_COLLATE
This is a follow-on to the following commit,
which somehow missed this category.
commit
08123d87ea3adde7ae36a205b3262804532efbed
Author: Karl Williamson <khw@cpan.org>
Date: Tue Dec 19 15:00:33 2023 -0700
-DNO_LOCALE implies -DNO_LOCALE_CTYPE, etc.
If we aren't to pay attention to locales in general; we certainly
shouldn't be paying attention to individual locale categories.
This commit allows for cleaner #ifdefs
Karl Williamson [Wed, 3 Jan 2024 16:55:48 +0000 (09:55 -0700)]
locale.c: Skip a debug check if no LC_CTYPE
This check is done only when DEBUGGING is active; but it makes no sense
if LC_CTYPE can only be the C locale.
Karl Williamson [Wed, 3 Jan 2024 17:06:49 +0000 (10:06 -0700)]
lib/locale.t: Display thousands separator under debug
This allows someone to run this test, which exercises every locale on
the system, and see at a glance what this value is. There is special
code to handle this value, and it is helpful to see what the system has
for it. This does not affect normal operation; only when the test is
run with a debugging environment variable set.
Tony Cook [Wed, 20 Dec 2023 03:58:05 +0000 (14:58 +1100)]
allow some basic infrastructure to load with -Dusedefaultstrict
The changes to t/test.pl appear to be real bugs.
This allows `make test_harness` to run, but many tests will still
fail under -Dusedefaultstrict
This addresses #21732 but does not fix it. I'm unsure how
supported that build option is.
Tony Cook [Thu, 28 Dec 2023 10:08:41 +0000 (21:08 +1100)]
pp_backtick: remove RC_STACK wrapper and use the new APIs
Karl Williamson [Tue, 26 Dec 2023 13:36:05 +0000 (06:36 -0700)]
locale.c: Adjust some #if, #else
This removes the need for a FALLTHROUGH comment.
Karl Williamson [Tue, 26 Dec 2023 13:34:01 +0000 (06:34 -0700)]
locale.c: Reorder two more case: statements
This is in preparation for future commits where the new order will make
more sense than the current one.
Karl Williamson [Fri, 29 Dec 2023 20:13:51 +0000 (13:13 -0700)]
locale.c: Reorder cases in a switch()
The CODESET case was kept last because it had by far the largest amount
of code of any of the cases. But the majority of it has now been
shunted into a separate function. There are, in contrast, many LC_TIME
related case statements, and future commits will add significantly to
the amount of code implementing them; therefore they are better placed
last in the switch().
David Mitchell [Wed, 3 Jan 2024 13:39:21 +0000 (13:39 +0000)]
perlguts: fix ref count in tie() example
Spotted by Marcel Telka.
David Mitchell [Wed, 3 Jan 2024 12:57:09 +0000 (12:57 +0000)]
[MERGE] PERL_RC_STACK: add _IMM, unwrap, fix leaks
- unwrap the ops pp_prtf, pp_sprintf, pp_subst, pp_substcont, pp_bless
- fix a few PERL_RC_STACK-related leaks I noticed along the way
- add some _IMM variants of the rpp_ functions when the SV being pushed
is an immortal
- make wide use of the _IMM and _NN rpp_ function variants
- optimise two-item stack pops and replaces
David Mitchell [Sat, 16 Dec 2023 16:50:41 +0000 (16:50 +0000)]
make rpp_popfree_2_NN() use rpp_free_2_()
Like for Perl_rpp_replace_2_1*, this means that freeing one or both SVs
being popped is done by a single function call.
David Mitchell [Wed, 13 Dec 2023 14:21:15 +0000 (14:21 +0000)]
Optimise rpp_replace_2_{1,IMM}_NN()
These two static functions are used in a lot of pp functions.
This commit does two main things. First, it makes the size of the inline
function smaller, and second, it uses a single branch (rather than two)
to decide whether either of the two SVs being popped need to be freed.
In detail: apart from the actual stack manipulation itself, the other
main action of these two functions:
rpp_replace_2_1_NN()
rpp_replace_2_IMM_NN()
is to do the equivalent of
    SvREFCNT_dec_NN(PL_stack_sp[-1]);
    SvREFCNT_dec_NN(PL_stack_sp[-0]);
Now, SvREFCNT_dec_NN() is an inline function which expands to
something like:
    U32 rc = SvREFCNT(sv);
    if (LIKELY(rc > 1))
        SvREFCNT(sv) = rc - 1;
    else
        Perl_sv_free2(aTHX_ sv, rc);
With this expanded *twice* within the body of rpp_replace_2_1_NN(),
there are two branch tests and two function calls - all of which are
expanded inline into the bodies of all 50+ pp functions which use it.
This commit makes this be changed to something equivalent to
    U32 rc1 = SvREFCNT(sv1);
    U32 rc2 = SvREFCNT(sv2);
    if (LIKELY(rc1 > 1 && rc2 > 1)) {
        SvREFCNT(sv1) = rc1 - 1;
        SvREFCNT(sv2) = rc2 - 1;
    }
    else
        Perl_rpp_free_2_(aTHX_ sv1, sv2, rc1, rc2);
Where Perl_rpp_free_2_() does the hard work of deciding whether either
or both SVs actually need freeing.
This approach assumes that, most of the time, rpp_replace_2_1_NN() won't
actually be freeing either of the two old args on the stack, because
often they are likely to be PADTMPs or lexicals or array elements or
immortals or whatever, which have a longer lifetime. I.e. this
commit is betting that
$a + ($b * $c); # RHS of '+' is a PADTMP
is more common than
$a + f(); # RHS of '+' is a temporary SV with RC==1
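A self-contained model of the single-branch fast path is sketched below. DemoSV, demo_free_2() and demo_dec_2() are toy stand-ins for the SV struct, Perl_rpp_free_2_() and the inlined code respectively; the sketch assumes the two SVs are distinct, and "freeing" merely sets a flag.

```c
#include <assert.h>

typedef struct { unsigned refcnt; int freed; } DemoSV;

static int demo_slow_calls = 0;    /* counts trips through the slow path */

/* Out-of-line slow path: decide per SV whether to free or decrement. */
static void
demo_free_2(DemoSV *sv1, DemoSV *sv2, unsigned rc1, unsigned rc2)
{
    demo_slow_calls++;

    if (rc1 > 1) sv1->refcnt = rc1 - 1;
    else       { sv1->refcnt = 0; sv1->freed = 1; }

    if (rc2 > 1) sv2->refcnt = rc2 - 1;
    else       { sv2->refcnt = 0; sv2->freed = 1; }
}

/* Inline fast path: one branch covers both SVs. */
static void
demo_dec_2(DemoSV *sv1, DemoSV *sv2)
{
    assert(sv1 && sv2);

    unsigned rc1 = sv1->refcnt;
    unsigned rc2 = sv2->refcnt;

    if (rc1 > 1 && rc2 > 1) {      /* common case: nothing to free */
        sv1->refcnt = rc1 - 1;
        sv2->refcnt = rc2 - 1;
    }
    else
        demo_free_2(sv1, sv2, rc1, rc2);
}
```

When both counts are above 1 the function call is skipped altogether; otherwise a single call handles whichever SVs need freeing.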
David Mitchell [Tue, 12 Dec 2023 11:58:29 +0000 (11:58 +0000)]
make RC-stack-aware: unwrap pp_bless()
Remove the temporary wrapper from pp_bless().
David Mitchell [Tue, 12 Dec 2023 11:49:45 +0000 (11:49 +0000)]
pp_subst(): consolidate some duplicated code
There are a couple of places in pp_subst() which create a new mortal to
return the iteration count. Consolidate them into a single code block at
the end of the function.
David Mitchell [Tue, 12 Dec 2023 00:31:31 +0000 (00:31 +0000)]
make RC-stack-aware: unwrap pp_subst, pp_substcont
Remove the temporary wrappers from pp_subst() and pp_substcont().
Note that under s///e, any arguments that were on the stack on entry
to pp_subst() are now left on there until the final call to pp_substcont,
whose responsibility it is now to pop them. This is so that they don't
get prematurely freed on PERL_RC_STACK builds.
David Mitchell [Mon, 11 Dec 2023 21:40:50 +0000 (21:40 +0000)]
make RC-stack-aware: unwrap pp_prtf, pp_sprintf
Remove the temporary wrappers from pp_prtf() (which implements the perl
'printf' function but saves two whole letters!) and pp_sprintf.
David Mitchell [Mon, 11 Dec 2023 11:36:55 +0000 (11:36 +0000)]
use rpp_foo_NN() and rpp_foo_IMM() widely
Make more use of the recently-added _NN and _IMM_NN variants of common
functions throughout the pp*.c files. The _NN ones assume anything being
popped off the stack is non-NULL, so that check can be skipped for each
SV being popped. The _IMM variants mean that the one item being put on
the stack is an immortal like &PL_sv_undef, so doesn't need its
reference count adjusting.
So these are all just small optimisations.
David Mitchell [Mon, 11 Dec 2023 11:37:37 +0000 (11:37 +0000)]
add _IMM variants to the rpp_foo() fns
These new function variants assume that the item being put on the stack
is one of the immortals (PL_sv_undef/yes/no/zero), and so skips
incrementing their reference count. This is for a minor efficiency
saving, rather than being necessary for correct functioning of the code.
This commit also tidies up a few of the related rpp_ functions: in
particular moving asserts out of the PERL_RC_STACK-only code into the
general code: an rpp_foo_NN() function should assert fail on a null SV
regardless of whether perl has been compiled under PERL_RC_STACK or not.
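The difference between an ordinary push and an _IMM push can be illustrated with a toy stack. The types and the demo_push()/demo_push_imm() names are hypothetical; they model the idea, not the real rpp_ API.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { unsigned refcnt; int immortal; } DemoSV;
typedef struct { DemoSV *slot[8]; size_t depth; } DemoStack;

/* Ordinary push: the stack takes its own counted reference. */
static void
demo_push(DemoStack *st, DemoSV *sv)
{
    assert(st && sv && st->depth < 8);
    sv->refcnt++;
    st->slot[st->depth++] = sv;
}

/* _IMM-style push: the caller promises sv is an immortal, so the
 * reference count is left untouched. */
static void
demo_push_imm(DemoStack *st, DemoSV *sv)
{
    assert(st && sv && st->depth < 8);
    assert(sv->immortal);          /* checked in every build of the model */
    st->slot[st->depth++] = sv;
}
```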
David Mitchell [Wed, 6 Dec 2023 20:46:09 +0000 (20:46 +0000)]
Resurrect immortals before checking for SvTEMP()
sv_clear() and sv_free2() both do, in this order, (simplified):
    #ifdef DEBUGGING
        if (SvTEMP(sv))
            Perl_ck_warner_d(..., "Attempt to free temp prematurely",...);
    #endif
        if (SvIMMORTAL(sv))
            SvREFCNT(sv) = SvREFCNT_IMMORTAL;
Now, it so happens that under DEBUGGING PERL_RC_STACK builds,
a) immortals such as PL_sv_undef have their refcount set to only 10 to
deliberately trigger the edge case of them being freed more often;
b) PERL_RC_STACK builds increasingly don't bother to increment the
reference counts of immortals when pushing them on the stack - this
saves a bit of time, and just means that once every two billion times on
normal builds the ref count drops to zero and sv_clear() sets it back to
SvREFCNT_IMMORTAL.
The combination of these has suddenly made it much more likely that
an immortal on the stack which has also been mortalised, will be passed
to sv_clear() and thus spuriously output the warning message.
So this commit swaps the order of the checks.
In the SvIMMORTAL() branch, it now also turns off SvTEMP.
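The effect of swapping the checks can be seen in a toy model. This is a sketch under stated assumptions: toy_sv, TOY_REFCNT_IMMORTAL and the toy_clear_* functions are invented for illustration and only mimic the simplified logic quoted above, not the real sv_clear().

```c
#include <assert.h>

#define TOY_REFCNT_IMMORTAL 1000    /* hypothetical; not perl's real value */

typedef struct { int refcnt; int immortal; int temp; } toy_sv;

static int toy_warnings = 0;    /* counts "Attempt to free temp prematurely" */

/* Old order: the temp check fires before the immortal is resurrected. */
static void toy_clear_old(toy_sv *sv) {
    assert(sv);
    if (sv->temp)
        toy_warnings++;
    if (sv->immortal)
        sv->refcnt = TOY_REFCNT_IMMORTAL;
}

/* New order: resurrect the immortal and clear its temp flag first, so
 * the temp check that follows stays quiet for immortals. */
static void toy_clear_new(toy_sv *sv) {
    assert(sv);
    if (sv->immortal) {
        sv->refcnt = TOY_REFCNT_IMMORTAL;
        sv->temp = 0;
    }
    if (sv->temp)
        toy_warnings++;
}
```

Passing a mortalised immortal whose count has dropped to zero to toy_clear_old() bumps toy_warnings; passing the same kind of value to toy_clear_new() does not.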
David Mitchell [Sat, 9 Dec 2023 10:13:00 +0000 (10:13 +0000)]
fix minor leak under use feature 'module_true'
Under PERL_RC_STACK builds, any return value from the module, i.e.
the
1;
or other final statement value, would leak.
David Mitchell [Fri, 8 Dec 2023 23:20:14 +0000 (23:20 +0000)]
fix obscure leak in sort { block } ...
This only leaked on PERL_RC_STACK builds, and only in the relatively
rare code path of a sort block which included a nested scope (such as
a for loop), and which then used 'return' to return the value.
David Mitchell [Thu, 7 Dec 2023 18:43:05 +0000 (18:43 +0000)]
fix leak in list const folding under PERL_RC_STACK
S_gen_constant_list() wasn't taking account of the stack possibly being
reference-counted, and so when a list was being constant-folded into an
AV, that AV would leak, such as
while (1) { my $x = eval '\(1..3)'; }
David Mitchell [Tue, 5 Dec 2023 20:17:44 +0000 (20:17 +0000)]
pp_sort: fix leak in PERL_RC_STACK inline sorting
For the optimised case where the src and dst are both the same array,
e.g.
@a = sort { ... } @a;
pp_sort() optimises this. When the code was modified to run under
PERL_RC_STACK, I introduced a leak: all the SVs on the stack after
sorting were then stored in the array and their ref counts incremented,
then the stack pointer was reset *without* decrementing the ref count of
each SV. So every SV in the array by the time pp_sort() returned had a
reference count one too high.
The fix is trivial - don't bump the ref counts when storing them in
the array.
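The bookkeeping error can be modelled in a few lines of C. This is only a sketch: toy_sv and toy_store_sorted() are hypothetical, and the model assumes each SV starts with exactly one reference, held by its stack slot.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int refcnt; } toy_sv;

/* Move n sorted items from the stack into the array, then abandon the
 * stack slots without decrementing anything (as a stack pointer reset
 * would). With bump != 0 this reproduces the leak; with bump == 0 it
 * models the fix, where the stack's reference is simply handed over
 * to the array. */
static void toy_store_sorted(toy_sv **stack, toy_sv **array,
                             size_t n, int bump) {
    assert(stack && array);
    for (size_t i = 0; i < n; i++) {
        array[i] = stack[i];
        if (bump)
            array[i]->refcnt++;     /* the extra, never-released reference */
        stack[i] = NULL;            /* slot abandoned, no decrement */
    }
}
```

With bump set, every element ends up with a count of 2 although only the array refers to it; with bump clear, the count stays at 1.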
Karl Williamson [Wed, 27 Dec 2023 13:11:52 +0000 (06:11 -0700)]
locale.c: handle codesets GB18030, EUC-TW
These two codesets are multi-byte, like UTF-8. Previously we sort of
looked for them, but not fully. This commit rectifies that. If the
code set matches one of those two, we don't create a bias towards the
locale being UTF-8.
I discovered this by testing in an unusual Configuration.
Karl Williamson [Thu, 21 Dec 2023 15:25:23 +0000 (08:25 -0700)]
locale.c: Minimize time spent with a toggled locale
It's better to disturb things for as short a time as possible.
Here it's just as convenient to reorder things so the untoggling is done
sooner.
Karl Williamson [Wed, 27 Dec 2023 13:10:57 +0000 (06:10 -0700)]
locale.c: toggle LC_CTYPE in S_override
Commit
0b60dbbe529b372662069aaadf3dfcf18f85c1cc missed this. Most of
this function must be done in the requested locale so that the libc
functions work on the correct underlying locale.
Karl Williamson [Wed, 27 Dec 2023 19:06:26 +0000 (12:06 -0700)]
locale.c: toggling locales is a no-op if no locales
When the only legal locale is C, toggling to another locale doesn't make
sense. By #defining the macro that implements toggling to do
nothing in this case, we can avoid some #ifdefs.
Karl Williamson [Wed, 27 Dec 2023 17:37:09 +0000 (10:37 -0700)]
locale.c: C is the only locale under NO_LOCALE
It is possible to Configure perl to not pay attention to locales at all.
Effectively that means the only permissible locale is "C", which
underlies all C programs at startup.
Thus, when asked what the current locale is, the answer is always going
to be "C"; and we can define the macro that computes this info to just
return "C" instead of doing any lookup.
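A minimal sketch of the idea, covering both the "current locale" macro described here and a toggling macro that compiles away. All names are hypothetical: TOY_NO_LOCALE stands in for the Configure-time setting, and the toy_* macros are not the ones locale.c actually defines.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical stand-in for a Configure-time "no locales" setting. */
#define TOY_NO_LOCALE 1

#if TOY_NO_LOCALE
   /* Only "C" exists: querying is a constant, toggling does nothing. */
#  define toy_current_locale(cat)       ((void) (cat), "C")
#  define toy_toggle_locale(cat, name)  ((void) (cat), (void) (name))
#else
#  define toy_current_locale(cat)       setlocale((cat), NULL)
#  define toy_toggle_locale(cat, name)  ((void) setlocale((cat), (name)))
#endif

/* Callers are written once, with no #ifdef at the call site. */
static const char *toy_describe_numeric(void) {
    toy_toggle_locale(LC_NUMERIC, "");  /* no-op under TOY_NO_LOCALE */
    const char *name = toy_current_locale(LC_NUMERIC);
    assert(name != NULL);
    return name;
}
```

The point of the design is that the conditional compilation lives in one place, in the macro definitions, rather than being repeated around every call.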
Karl Williamson [Sun, 31 Dec 2023 05:00:53 +0000 (22:00 -0700)]
locale.c: Be sure to toggle into dot radix locale
This fixes GH #21746
Perl keeps the LC_NUMERIC category in a locale where the radix character
is a dot, regardless of what the user has requested. This is because
much XS code has been written with the dot assumption. When the user's
actual radix character is desired, the locale is briefly toggled to that
one for the duration of the operation.
When the user changes the LC_NUMERIC locale, the new one is noted, but
the attempted change is otherwise ignored unless its radix is a dot.
The new one will be briefly toggled into when appropriate.
The blamed commit contains a logic error
commit
818cdb7aa9f85227c1c7313257c6204c872beb94
Author: Karl Williamson <khw@cpan.org>
AuthorDate: Sun Apr 11 05:57:07 2021 -0600
Commit: Karl Williamson <khw@cpan.org>
CommitDate: Thu Sep 1 09:02:04 2022 -0600
locale.c: Skip code if will be a no-op
It decided it was a no-op if the new locale that the user is changing to
is the same as the previous locale. But it didn't consider that what
actually happens is that the new locale does actually get changed, and
this code is supposed to make sure that, before returning control to the
user, a dot radix locale is in effect.
If the new locale is a dot radix locale, then no harm is done by
skipping the code, but otherwise things can go wrong.
I am chagrined that I made this logic error without noticing before it
got pushed, and am surprised that it took this long for the error to
surface. There must be something else intervening to make this not a
problem in most circumstances, but I haven't analyzed what it might be.
The details as to why it happened in this test case are pretty obscure.
The locale in effect is looking for a comma radix, but what is being
checked for is a Perl version number, like 5.0936. When converting that
to a floating point number, the dot is not recognized, and only the
initial '5' is found. The failing code in a module has different
actions depending on the current perl version it is being called from,
and the conditional got the answer wrong because 5 is less than 5.0936,
whereas the actual version is above that. So it did the wrong thing and
caused an error.
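The truncation described above can be reproduced without installing a comma-radix locale by using a small parser that takes the radix character as a parameter. This is a sketch: toy_parse() is a hypothetical helper that only imitates how a locale-aware string-to-double conversion stops at a character it does not recognise as the radix.

```c
#include <assert.h>
#include <ctype.h>

/* Parse an unsigned decimal number, treating `radix` as the only valid
 * radix character. Parsing stops at the first character that is neither
 * a digit nor the radix. */
static double toy_parse(const char *s, char radix) {
    double value = 0.0;
    double scale = 0.1;

    assert(s);
    while (isdigit((unsigned char) *s))
        value = value * 10.0 + (*s++ - '0');
    if (*s != radix)
        return value;           /* e.g. "5.0936" with radix ',' stops here */
    s++;
    while (isdigit((unsigned char) *s)) {
        value += (*s++ - '0') * scale;
        scale *= 0.1;
    }
    return value;
}
```

Calling toy_parse("5.0936", ',') yields 5, which compares as less than 5.0936, while toy_parse("5.0936", '.') yields the intended value; that is the wrong-branch comparison the failing module ran into.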
Karl Williamson [Tue, 2 Jan 2024 20:31:00 +0000 (13:31 -0700)]
locale.c: This label is only used in WIN32
Karl Williamson [Sun, 3 Dec 2023 17:47:42 +0000 (10:47 -0700)]
locale.c: Refactor an #ifdef
I have some #ifdef'd code that when enabled makes the locale handling
think it's running on a Windows MinGW machine. It doesn't emulate the
whole platform by any means, but it does reproduce the different logic
that is required for Windows in locale.c, and allows checking that
changes made likely will compile there without having to actually go to
a Windows machine.
It is mostly hidden from the rest of the code, except in the one spot
where it gets set up. This makes it unobtrusive, but more importantly
maximizes the chances of it faithfully doing what Windows would do.
But there are two places in locale.c where I couldn't completely hide
it. One is in teardown to avoid a leak, and the other is this spot in
the code where the codeset names are calculated. There are big
differences in the Windows name syntax (integers) from the POSIX ones
(character strings).
This commit uses a macro to convert the integers into strings, so that
the required #ifdef doesn't interrupt the logic flow.
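One way such a macro could look is sketched below. This is an assumption, not the code from the commit: the TOY_* names are invented, and preprocessor stringification is merely one plausible technique for turning an integer code page into a string.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification so that the macro argument is expanded
 * before being turned into a string literal. */
#define TOY_STRINGIFY_(x)  #x
#define TOY_STRINGIFY(x)   TOY_STRINGIFY_(x)

#define TOY_CP_UTF8  65001      /* Windows code page number for UTF-8 */

/* TOY_WINDOWS_STYLE is a hypothetical switch; it is left undefined
 * here, so the POSIX-style name is selected. */
#ifdef TOY_WINDOWS_STYLE
#  define TOY_CODESET(posix_name, cp)  TOY_STRINGIFY(cp)
#else
#  define TOY_CODESET(posix_name, cp)  posix_name
#endif

/* The caller's logic is the same on both kinds of platform. */
static const char *toy_utf8_codeset(void) {
    const char *name = TOY_CODESET("UTF-8", TOY_CP_UTF8);
    assert(name != NULL);
    return name;
}
```

With the #ifdef confined to the macro definition, code that compares or prints codeset names can treat them uniformly as strings.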
Karl Williamson [Tue, 4 Apr 2023 22:18:58 +0000 (16:18 -0600)]
Define setlocale_i() on unsafe threaded builds
On threaded Configurations where thread-safe locale handling is not
available, perl automatically does a modicum of prevention of races by
executing locale changes in a critical section, and copying the results
to a thread-safe location.
This commit defines setlocale_i() on such builds. This macro is used to
bypass more complex handling required in fully thread-safe builds, and
is used where the libc setlocale() can be wrapped such that it works for
both querying what the existing locale is, and changing the locale.
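The "critical section plus copy" idea from the first paragraph can be sketched as follows. This is not setlocale_i() itself: toy_setlocale() and toy_locale_mutex are hypothetical names, and a plain pthread mutex is assumed as the locking primitive.

```c
#include <assert.h>
#include <locale.h>
#include <pthread.h>
#include <string.h>

static pthread_mutex_t toy_locale_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Query (new_locale == NULL) or change the locale inside a critical
 * section, copying libc's answer into caller-owned storage before the
 * lock is released. Returns buf on success, or NULL if setlocale()
 * failed or buf is too small. */
static const char *toy_setlocale(int category, const char *new_locale,
                                 char *buf, size_t buflen) {
    const char *result = NULL;

    assert(buf && buflen > 0);
    pthread_mutex_lock(&toy_locale_mutex);
    const char *libc_result = setlocale(category, new_locale);
    if (libc_result && strlen(libc_result) < buflen) {
        strcpy(buf, libc_result);   /* libc's buffer may be overwritten later */
        result = buf;
    }
    pthread_mutex_unlock(&toy_locale_mutex);
    return result;
}
```

The copy matters because the string returned by setlocale() may live in static storage that a later call, possibly from another thread, can overwrite. The lock only protects callers that go through this wrapper.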
Karl Williamson [Fri, 7 Apr 2023 15:51:37 +0000 (09:51 -0600)]
locale.c: Add a debugging statement
This can be helpful in tracing what's happening with nl_langinfo()
calls.
Karl Williamson [Tue, 13 Jun 2023 17:38:56 +0000 (11:38 -0600)]
locale.c: Change some -DLv statements to -DL
These debug statements show something isn't quite normal, so shouldn't
require a verbose option to be displayed
Karl Williamson [Tue, 2 Jan 2024 23:02:37 +0000 (16:02 -0700)]
locale.c: Rmv duplicate strlen()
Inadvertently introduced in
16c984f24273a831a74a01b900d5a400ae331c5d
Karl Williamson [Mon, 25 Dec 2023 15:11:47 +0000 (08:11 -0700)]
Experimentally enable glibc undocumented querylocale()
This is querylocale() by another name, and is undocumented, hence we
haven't enabled it by default. But it seems to work fine. In order to
gain wider experience in using it, it is here default-enabled through
5.39.9 (unless we decide to end the experiment earlier), at which point
a compilation error will remind us to decide to keep it or take it out.
I put the check in locale.c instead of the more obvious perl.h, because
the definition would come earlier in perl.h than the PERL_VERSION macros
are defined, and I don't think it's worth moving things around for just a
potential of a few releases.
Karl Williamson [Mon, 25 Dec 2023 14:42:51 +0000 (07:42 -0700)]
Revert "Experimentally enable glibc undocumented querylocale()"
This reverts commit
2ba88c8c7f1c33fe9f3145cbd2c4de3b1668efe9.
It turns out that that commit causes a porting test failure on the rare
Configuration where the POSIX 2008 locale API is used on a system
without threads. (Someone might want to do that on platforms where
setlocale() is buggy.)
The next commit fixes this test. Normally that would just be a
follow-on commit without this reversion. But to ensure that when the
time comes to revert, the whole process is just one commit, this
reversion is done, and the next commit reinstates this reverted commit
plus adding the fix.
Karl Williamson [Tue, 19 Dec 2023 21:58:13 +0000 (14:58 -0700)]
Hoist nl_item typedef definition
nl_item is a typedef defined in <langinfo.h> for use by nl_langinfo().
But on platforms without this, perl emulates it, and hence needs to
create its own nl_item typedef.
Prior to this commit, the definition was in locale.c, which meant that
there needed to be two definitions in embed.fnc for each function that
has an argument of this type.
Simply putting it in "perl_langinfo.h" when there is no <langinfo.h>
allows those duplicate definitions to be removed.