This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
James E Keenan [Fri, 3 Feb 2023 00:19:35 +0000 (00:19 +0000)]
Correct code-like snippet in documentation
Make it a close-bracket to pair with open-bracket.
Bump $VERSION in Math-Complex .pm files to keep porting tests happy.
Yves Orton [Tue, 31 Jan 2023 07:01:13 +0000 (08:01 +0100)]
t/29a_upcopy.t - under parallel builds allow more time for test
Karl has reported that he has issues with t/29a_upcopy.t under parallel
builds. I cannot see any file-based race conditions, but I can see code using
alarm around a test, with a timeout I can easily imagine being too short for a
loaded box running many tests in parallel. This patch allows the test to use 20 seconds
instead of 10 if TEST_JOBS or HARNESS_OPTIONS are defined in the environment.
Hopefully this fixes tests on Karl's box.
In a previous commit Dave M raised this from 5 to 10 seconds, so let's double
it again and see if Karl's errors go away.
In an abundance of caution I also adjusted the other two cases of using
alarm() in this file to use the same logic and produce similar style
error messages.
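The test itself is Perl; purely as an illustration, the timeout rule described above could be sketched in C like this. The function names are hypothetical, and the sketch assumes that "defined" means the variable is present in the environment, even if empty.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: choose the alarm timeout in seconds.
 * Either variable being defined (even as an empty string) is taken as
 * a sign of a parallel run, which gets the doubled limit. */
unsigned int pick_alarm_seconds(const char *test_jobs,
                                const char *harness_options)
{
    return (test_jobs != NULL || harness_options != NULL) ? 20u : 10u;
}

/* Convenience wrapper that reads the real environment. */
unsigned int alarm_seconds_from_env(void)
{
    return pick_alarm_seconds(getenv("TEST_JOBS"), getenv("HARNESS_OPTIONS"));
}
```

Keeping the decision in a function that takes the two values as arguments makes it easy to check without touching the process environment.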
Yves Orton [Tue, 31 Jan 2023 04:01:53 +0000 (05:01 +0100)]
embed.pl - the 's', 'S', 'i' and 'I' flags are mutually exclusive
We had a bug where we processed the first one in the flags definition. Sorting
the flags or rearranging them changes the output, which shouldn't happen.
This also fixes the handling and specification of PerlEnv_putenv(), which was
marked "si" when it should have been marked "i". This required changing its
implementation from a Perl_ prefix to a S_ prefix and regenerating.
I have run embed.pl in a loop with a local patch to shuffle the flags to see
if there were any other order dependencies. No output files changed so I
assume with this patch we are free of such bugs.
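As a rough sketch of an order-independent check (embed.pl is Perl and its real logic is not shown here), the exclusivity rule could be expressed in C as below. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count how many of the mutually exclusive flags 's', 'S', 'i', 'I'
 * appear in a flags string. The result does not depend on flag order. */
size_t count_exclusive_flags(const char *flags)
{
    size_t n = 0;

    assert(flags != NULL);
    for (; *flags != '\0'; flags++) {
        if (strchr("sSiI", *flags) != NULL)
            n++;
    }
    return n;
}

/* A flags string is acceptable when at most one exclusive flag is set. */
int flags_are_consistent(const char *flags)
{
    return count_exclusive_flags(flags) <= 1;
}
```

With a check of this shape, "si" and "is" are both rejected, whichever flag comes first.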
Yves Orton [Mon, 30 Jan 2023 01:55:45 +0000 (02:55 +0100)]
File-Find/t - rework tempdir creation and cleanup
Fixes #20734 - in a previous patch I missed the early cleanup call
and the fact that it could result in a race condition. This hopefully
resolves the problem.
These test files are pretty crufty. It would be nice to see them split
apart so that the "sanity" checks which expect to be run in t/ are
executed in a separate test file from the checks which build a tree to
traverse for testing. A perfect task for a new contributor.
Elvin Aslanov [Sun, 29 Jan 2023 19:52:14 +0000 (23:52 +0400)]
UNIVERSAL.pm - Use lexical variables in Synopsis
Add `my`, and name variables differently to avoid confusion.
Todd Rinaldo [Mon, 30 Jan 2023 17:14:04 +0000 (17:14 +0000)]
Update autodie to CPAN version 2.36
[DELTA]
2.36 2023-01-30 16:48:23+00:00 UTC
* Remove the use of ' as a package separator.
* Fix spelling errors in POD.
Kurt Fitzner [Sun, 29 Jan 2023 14:53:53 +0000 (15:53 +0100)]
Configure - fix handling of quoted gcc output
This patch was submitted in GH issue #20606. When gcc output contains
quoted elements we fail to handle it properly. This tweaks the sed
command to do so.
Fixes #20606.
Yves Orton [Sun, 29 Jan 2023 15:15:46 +0000 (16:15 +0100)]
win32/*akefile - delete before rename
All of the other rename commands in win32/Makefile and win32/GNUmakefile
are guarded by a del statement. This does the equivalent for the
rename command that creates config.sh.
Fixes #20749.
Aristotle Pagaltzis [Sun, 29 Jan 2023 10:19:59 +0000 (11:19 +0100)]
Update Memoize to 1.16
Paul "LeoNerd" Evans [Sat, 28 Jan 2023 14:34:52 +0000 (14:34 +0000)]
Make the new optree function declarations conditional on PERL_CORE|PERL_USE_VOLATILE_API
Paul "LeoNerd" Evans [Fri, 27 Jan 2023 15:28:52 +0000 (15:28 +0000)]
Expose {optimize,finalize}_optree() as real API functions
These are required by XS modules which want to create custom
LOGOP-shaped optrees, to ensure that both sides of the tree get
optimised and finalised.
See also
https://github.com/Perl/perl5/issues/20743
Todd Rinaldo [Fri, 27 Jan 2023 16:20:13 +0000 (16:20 +0000)]
Update autodie to CPAN version 2.35
[DELTA]
2.35 2023-01-27 16:00:26+00:00 UTC
* Prepare for 5.38 changes to deprecate smartmatch
* Remove +x bit from pm and t files
* CI - Turn off Pod coverage and critic tests below 5.12
Yves Orton [Thu, 26 Jan 2023 09:27:10 +0000 (10:27 +0100)]
pp_hot.c - fix branch reset matches in list context
I am kinda surprised this issue was not picked up by one of our
other test files. I would have expected one of the t/re/regexp.t
based patches to validate list context matches populating the list
properly. But apparently not! So when I fixed branch reset in
fe5492d916201ce31a107839a36bcb1435fe7bf0 I missed the list context
logic. This fixes the oversight. Thanks to Andreas Koenig for the
BBC report on this.
This also changes the code to use SSize_t for various length related
operations; the original code was using I32, which might break on
very very long strings. Thanks to Tony C for pointing that out.
Scott Baker [Tue, 24 Jan 2023 18:12:38 +0000 (10:12 -0800)]
Add some closing parens I missed in the first pass
Tony Cook [Tue, 24 Jan 2023 23:07:32 +0000 (10:07 +1100)]
Tony Cook [Mon, 23 Jan 2023 04:28:12 +0000 (15:28 +1100)]
check the IO object exists when writing to IO magic variables
pp_select() ensures that the GV in PL_defoutgv has an IO object
when the default output is set, but can't prevent that GV
being cleared afterwards, resulting in a seg fault when the
variable is written.
To prevent this, check PL_defoutgv has an IO object before trying
to write to it.
Fixes #20733
Scott Baker [Tue, 24 Jan 2023 00:04:01 +0000 (16:04 -0800)]
Put the example after the explanation
Scott Baker [Mon, 23 Jan 2023 23:56:06 +0000 (15:56 -0800)]
Add some examples to `unshift()` docs
Yves Orton [Fri, 20 Jan 2023 15:19:36 +0000 (16:19 +0100)]
dump.c - dump new regexp fields properly
Show the pointer values and their contents. Also show the "MOTHER_RE"
at the *end* of the dump, as otherwise it can be quite hard to read.
This patch also includes stripping out the versioned test adjustments
for regexp related dumps. Devel-Peek is in ext/ so it won't be used on
an older perl and we can just make it correct for the latest state.
The test for the dump of a branch reset pattern also implicitly
tests whether the branch reset pointer table logic is working correctly.
In the process of writing this patch I discovered there was an off-by-one
error. See 8111bf2fc3870f8146bb46652b66bd517e82b4dd for the fix.
Yves Orton [Fri, 20 Jan 2023 09:16:52 +0000 (10:16 +0100)]
hv_macro.h - fix comment
Some weird typos there, this fixes them to be correct.
Yves Orton [Fri, 20 Jan 2023 15:15:07 +0000 (16:15 +0100)]
regcomp.c - fix fencepost error duping a regex
In fe5492d916201ce31a107839a36bcb1435fe7bf0 I made a fencepost error
copying the logical_to_parno and related data structures. They all needed a
+1 on their size, as the paren counts (logical and physical) have
to account for capture buffer 0, which is always present and represents
the entire match.
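A minimal sketch of the sizing rule, assuming a plain int table rather than the real regcomp.c structures (the function name is hypothetical):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate a per-paren table. Index 0 is the whole match, so a
 * pattern with nparens capture groups needs nparens + 1 entries. */
int *dup_paren_table(const int *src, size_t nparens)
{
    size_t count = nparens + 1;           /* the +1 covers buffer 0 */
    int *copy;

    assert(src != NULL);
    copy = malloc(count * sizeof *copy);
    if (copy != NULL)
        memcpy(copy, src, count * sizeof *copy);
    return copy;                          /* caller frees */
}
```

Sizing the copy with nparens alone would drop the last entry, which is the kind of fencepost error described above.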
reneeb [Fri, 20 Jan 2023 23:42:50 +0000 (00:42 +0100)]
Prepare Module::CoreList for 5.37.9
reneeb [Fri, 20 Jan 2023 23:22:13 +0000 (00:22 +0100)]
bump version for 5.37.9
reneeb [Fri, 20 Jan 2023 16:41:57 +0000 (17:41 +0100)]
New perldelta for 5.37.9
reneeb [Fri, 20 Jan 2023 16:37:53 +0000 (17:37 +0100)]
tick release of 5.37.8 in Porting/release_schedule.pod
reneeb [Fri, 20 Jan 2023 16:31:11 +0000 (17:31 +0100)]
update epigraphs.pod
reneeb [Fri, 20 Jan 2023 16:14:50 +0000 (17:14 +0100)]
Merge branch 'release-5.37.8' into blead
Hugo van der Sanden [Thu, 19 Jan 2023 00:27:31 +0000 (00:27 +0000)]
bisect-runner docs: explain more about bisection
Hugo van der Sanden [Wed, 18 Jan 2023 23:58:19 +0000 (23:58 +0000)]
bisect-runner docs: modify example to use 'expect-fail'
reneeb [Fri, 20 Jan 2023 10:56:57 +0000 (11:56 +0100)]
Add new release to perlhist
reneeb [Fri, 20 Jan 2023 10:44:33 +0000 (11:44 +0100)]
finalize perldelta for 5.37.8
reneeb [Fri, 20 Jan 2023 09:56:23 +0000 (10:56 +0100)]
Update Module::CoreList for 5.37.8
reneeb [Fri, 20 Jan 2023 08:46:00 +0000 (09:46 +0100)]
Update copyright years
Yves Orton [Thu, 19 Jan 2023 15:36:05 +0000 (16:36 +0100)]
regexec.c - harden internals against missing logical_nparens
We can default a 0 rx->logical_nparens to rx->nparens. If rx->logical_nparens
is zero then either rx->nparens is also zero, or it can be defaulted. This
will fix most re::engine::XXX modules that do not know about the new field,
provided they zero the rx structure during construction. If they don't then
this patch won't hurt anything and we will have to patch them directly.
Also mark re_op_compile() as available to extensions. Marking it as hidden
means that re::engine::PCRE2 and others cannot build.
This patch should go a long way towards fixing issue #20710.
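The defaulting rule can be sketched as follows. The struct is a hypothetical, heavily simplified stand-in with only the two counters; it is not the real regexp structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the regexp structure. */
struct toy_regexp {
    unsigned int nparens;          /* physical capture buffers */
    unsigned int logical_nparens;  /* user-visible buffers; 0 if unset */
};

/* Fall back to nparens when logical_nparens was left at zero, as an
 * engine that zeroes the structure but predates the field would do. */
unsigned int effective_logical_nparens(const struct toy_regexp *rx)
{
    assert(rx != NULL);
    return rx->logical_nparens != 0 ? rx->logical_nparens : rx->nparens;
}
```

When both counters are zero the fallback also yields zero, so the pattern-without-captures case needs no special handling.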
David Cantrell [Thu, 19 Jan 2023 20:22:04 +0000 (20:22 +0000)]
Export S_ISLNK and S_ISSOCK from POSIX.pm
They are already available in Fcntl.pm, from whence POSIX.pm already
gets all the other S_IS* macros, so this just adds them to the list
of imports (from Fcntl into POSIX) and exports (from POSIX)
reneeb [Fri, 20 Jan 2023 00:28:36 +0000 (01:28 +0100)]
update podlators to 5.01
reneeb [Thu, 19 Jan 2023 23:07:19 +0000 (00:07 +0100)]
Update IO::Zlib to 1.14
reneeb [Thu, 19 Jan 2023 22:38:37 +0000 (23:38 +0100)]
update Config::Perl::V to 0.35
reneeb [Thu, 19 Jan 2023 22:02:45 +0000 (23:02 +0100)]
update JSON::PP to 4.16
Yves Orton [Sun, 8 Jan 2023 14:49:04 +0000 (15:49 +0100)]
regcomp.c - add optimistic eval (*{ ... }) and (**{ ... })
This adds (*{ ... }) and (**{ ... }) as equivalents to (?{ ... }) and
(??{ ... }). The only difference being that the star variants are
"optimistic" and are defined to never disable optimisations. This is
especially relevant now that use of (?{ ... }) prevents important
optimisations anywhere in the pattern, instead of the older and inconsistent
rules where it only affected the parts that contained the EVAL.
It is also very useful for injecting debugging style expressions to the
pattern to understand what the regex engine is actually doing. The older
style (?{ ... }) variants would change the regex engine's behavior, meaning
this was not as effective a tool as it could have been.
Similarly it is now possible to test that a given regex optimisation
works correctly using (*{ ... }), which was not possible with (?{ ... }).
Tony Cook [Tue, 17 Jan 2023 23:47:39 +0000 (10:47 +1100)]
op/fork.t: skip the ulimit fork test under LSAN
This was producing noise, at least on Linux, since the -u limit on
Linux also limits threads.
Fixes #20712
lilinjie [Tue, 17 Jan 2023 06:05:35 +0000 (14:05 +0800)]
Fix typo in older perldelta
Signed-off-by: lilinjie <lilinjie@uniontech.com>
Committer: Li Linjie is now a Perl author.
Run Porting/updateAUTHORS.pl to update .mailmap
Florian Weimer [Tue, 17 Jan 2023 18:07:54 +0000 (19:07 +0100)]
Configure: Add various C99 compatibility improvements
Two C99 compatibility issues are fixed by these changes: Return
types are made explicit where they previously defaulted to int,
and all called functions are now declared explicitly (either by
including additional headers, or by adding prototypes manually).
This avoids implicit ints and implicit function declarations,
both legacy C language features removed in the 1999 revision
of the language.
Verified with an instrumented GCC compiler on GNU/Linux.
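For illustration only, here is the shape of the two fixes on a made-up probe; it is not taken from Configure, and probe_strlen is a hypothetical name.

```c
#include <string.h>   /* declares strlen(), so no implicit declaration */

/* A pre-C99 probe might have looked like this (kept in a comment
 * because strict C99 compilers reject it):
 *
 *     main() { return strlen("perl") != 4; }
 *
 * The return type defaulted to int and strlen() was never declared.
 */

/* C99-clean version: explicit return type, declared callee. */
int probe_strlen(void)
{
    return strlen("perl") != 4;   /* 0 means the probe succeeded */
}
```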
Karl Williamson [Tue, 17 Jan 2023 18:08:40 +0000 (11:08 -0700)]
Peter Levine is now a Perl author.
Peter Levine [Tue, 17 Jan 2023 07:47:12 +0000 (02:47 -0500)]
Add parameter types to declarations for clang-16
ANSI C style function declarations without parameter types are errors with clang-16.
James E Keenan [Fri, 13 Jan 2023 00:44:11 +0000 (00:44 +0000)]
bisect-runner.pl: When did warnings clear up
Illustrate bisection to identify commit at which run-time warnings
ceased being emitted from a program.
Tony Cook [Thu, 12 Jan 2023 00:35:25 +0000 (11:35 +1100)]
pp_multiconcat: don't set svpv_p to an invalid pointer
When svpv_base == svpv_buf, svpv_p would be set to point before the
buffer, which is undefined.
This appears to be what gcc 13 is complaining about in #20678,
despite that report including what appears to be a completely valid
address, on a line where the value of svpv_p is now within the range
of svpv_buf.
An intermediate approach to this used:
    temp = svpv_p;
    if (svpv_p++ == svpv_end)
        break;
but this is also incorrect, since svpv_p would end up as an invalid
pointer, though gcc UBSAN didn't pick that up.
Fixes #20678.
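The general rule at play (never form a pointer before the start of a buffer) can be sketched with a backwards walk. This is a generic example with a hypothetical function name, not the pp_multiconcat code.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the elements of buf[0..len-1] walking backwards.
 * The pointer is only decremented after checking it is not already at
 * the start, so it never points before the buffer (len may be 0). */
long sum_backwards(const int *buf, size_t len)
{
    const int *p;
    long total = 0;

    assert(buf != NULL);
    p = buf + len;              /* one past the end: a valid pointer */
    while (p != buf) {
        --p;                    /* still within [buf, buf + len) */
        total += *p;
    }
    return total;
}
```

Testing before decrementing, rather than decrementing and then comparing against buf - 1, keeps every intermediate pointer value defined.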
Dagfinn Ilmari Mannsåker [Mon, 16 Jan 2023 14:07:52 +0000 (14:07 +0000)]
Remove full stop in the 'try' feature heading
None of the other headings in feature.pm have full stops.
Elvin Aslanov [Sat, 14 Jan 2023 13:32:50 +0000 (17:32 +0400)]
Override *.h files as C with Linguist
GitHub classifies 23 files as C++ for some reason.
https://github.com/Perl/perl5/search?q=language%3AC%2B%2B&type=code
I believe Perl doesn't contain C++ code, and C++ headers can use the distinguishable .hh, .hpp, .hxx, and .h++ extensions.
Yves Orton [Sun, 8 Jan 2023 10:35:38 +0000 (11:35 +0100)]
regcomp_study.c - disable CURLYX optimizations when EVAL has been seen anywhere
Historically we disabled CURLYX optimizations when they
*contained* an EVAL, on the assumption that the optimization might
affect how many times, etc, the eval was called. However, this is
also true for CURLYX with evals *afterwards*. If the CURLYN or CURLYM
optimization can prune off the search space, then an eval afterwards
will be affected. And when you take into account GOSUB, it means that
an eval in front might be affected by an optimization after it.
So for now we disable CURLYN and CURLYM in any pattern with an EVAL.
Yves Orton [Mon, 9 Jan 2023 20:09:29 +0000 (21:09 +0100)]
regcomp.c - increase size of CURLY nodes so the min/max is a I32
This allows us to resolve a test inconsistency between CURLYX and CURLY
and CURLYM, which have different maximums. We use I32 and not U32 because
the existing count logic uses -1 internally and using an I32 for the min/max
prevents warnings about comparing signed and unsigned values when the
count is compared against the min or max.
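A small sketch of why a -1 sentinel and an unsigned bound mix badly. It uses the standard int32_t and uint32_t types as stand-ins for I32 and U32, and the function names are hypothetical.

```c
#include <stdint.h>

/* With signed bounds, a count of -1 compares as "below min". */
int below_min_signed(int32_t count, int32_t min)
{
    return count < min;
}

/* With an unsigned bound the count is converted to unsigned first, so
 * -1 becomes UINT32_MAX and the comparison silently flips. The cast
 * spells out what the implicit conversion would do. */
int below_min_unsigned(int32_t count, uint32_t min)
{
    return (uint32_t)count < min;
}
```

Compilers warn about the mixed comparison for exactly this reason; keeping min and max signed avoids both the warning and the surprise.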
Yves Orton [Mon, 9 Jan 2023 19:37:28 +0000 (20:37 +0100)]
regexec.c - fix accept in CURLYX/WHILEM construct.
The ACCEPT logic didn't know how to handle WHILEM, which for
some reason does not have a next_off defined. I am not sure why.
This was revealed by forcing CURLYX optimisations off. This includes
a patch to test what happens if we embed an eval group in the tests
run by regexp.t when run via regexp_normal.t, which disabled CURLYX ->
CURLYN and CURLYM optimisations and revealed this issue.
This adds t/re/regexp_normal.t which tests "normalized" forms of
the patterns in t/re/re_tests by munging them in various ways
to see if they still behave as expected. For instance injecting
a (?{}) can disable optimisations.
Yves Orton [Sat, 14 Jan 2023 10:46:03 +0000 (11:46 +0100)]
regexec.c - fix memory leak in EVAL.
EVAL was calling regcppush twice per invocation, once before executing the
callback, and once after. But not regcppop'ing twice. So each time we
would accumulate an extra "frame" of data. This is/was hidden somewhat by
the way we eventually "blow" the stack, so the extra data was just thrown
away at the end.
This removes the second set of pushes so that the save stack stays a stable
size as it unwinds from each failed eval.
We also weren't cleaning up after a (?{...}) when we failed to match to its
right. This unwinds the stack and restores the parens properly.
This adds tests to check how the save stack grows during patterns using
(?{ ... }) and (??{ ... }) and ensure that when we backtrack and re-execute
the EVAL again it cleans up the stack as it goes.
Yves Orton [Mon, 9 Jan 2023 21:28:19 +0000 (22:28 +0100)]
regcomp_trie.c - use the indirect types so we are safe to changes
We shouldn't assume that a TRIEC is a regcomp_charclass. We have a per
opcode type exactly for this type of use, so let's use it.
Yves Orton [Mon, 9 Jan 2023 21:11:36 +0000 (22:11 +0100)]
regcomp.c - add whitespace to binary operation
The tight & is hard to read.
Yves Orton [Mon, 9 Jan 2023 21:10:42 +0000 (22:10 +0100)]
regcomp.h - get rid of EXTRA_STEP defines
They are unused these days.
Yves Orton [Mon, 9 Jan 2023 20:59:24 +0000 (21:59 +0100)]
regexec.c - rework CLOSE_CAPTURE() to take rex as an arg to enable reuse.
This also splits up CLOSE_CAPTURE() into two parts, with the important parts
implemented by CLOSE_ANY_CAPTURE(), and the debugging parts in
CLOSE_CAPTURE(). This allows it to be used in contexts where the regexp
structure isn't set up under the name 'rex', and where the debugging output it
includes might not be relevant or possible to produce.
This encapsulates all the places that "close" a capture buffer, and ensures
that they are closed properly. One important case in particular cannot use
CLOSE_CAPTURE() directly, as it does not have a 'rex' variable in scope (it is
called prog in this function), nor the debugging context used in normal
execution of CLOSE_CAPTURE(). Using CLOSE_ANY_CAPTURE() instead means all the
main points that update capture buffer state use the same macro API.
Yves Orton [Mon, 9 Jan 2023 20:27:14 +0000 (21:27 +0100)]
regcomp_study.c - Add a way to disable CURLYX optimisations
Also break up the condition so there is one condition per line so
it is more readable, and fold repeated binary tests together. This
makes it more obvious what the expression is doing.
Yves Orton [Mon, 9 Jan 2023 22:20:04 +0000 (23:20 +0100)]
t/re/re_tests - extend test to show more buffers
This is a tricky test, showing more buffers makes it a bit easier
to understand if you break it. (Guess what I did?)
Benjamin Smith [Sat, 14 Jan 2023 20:22:43 +0000 (20:22 +0000)]
Porting/updateAUTHORS.pl: Suggested `git config` command is wrong
When you run tests in a repository that has local modifications
`t/porting/authors.t` checks if your git is correctly configured
with your identity. However, the suggested commands don't work,
because `git config` doesn't need a `--set` flag to set options.
This patch removes the `--set` from the suggested commands which is
not necessary to set the variables in the local git repository. For
comparison, if you run `git commit` without your identity set up, it
suggests:
```
Run
git config --global user.email "you@example.com"
git config --global user.name "Your Name"
to set your account's default identity.
Omit --global to set the identity only in this repository.
```
James E Keenan [Sat, 14 Jan 2023 18:20:20 +0000 (13:20 -0500)]
perldelta for 67244d99e1 and earlier commits
Commits for Math-Complex upgrade to 1.60
Yves Orton [Fri, 13 Jan 2023 19:37:01 +0000 (20:37 +0100)]
embed.pl - add a way to declare a parameter should be non-zero
This autogenerates the required asserts to validate a parameter is non-zero.
It uses it to replace the check in regrepeat() as a first example.
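The kind of assert a generator might emit for such a parameter could look like the sketch below. The macro and function names are hypothetical and are not what embed.pl actually produces.

```c
#include <assert.h>

/* Hypothetical macro of the kind a generator could emit for a
 * parameter declared as "must be non-zero". */
#define ASSERT_NONZERO(param) assert((param) != 0)

/* Toy function with a non-zero precondition on max. */
int clamp_to_max(int value, int max)
{
    ASSERT_NONZERO(max);
    return value > max ? max : value;
}
```

Because the check goes through assert(), it disappears when NDEBUG is defined, which suits a precondition that should only fire in debugging builds.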
Paul "LeoNerd" Evans [Thu, 12 Jan 2023 17:08:15 +0000 (17:08 +0000)]
Expose op_force_list() as a real API function; use it directly in op.c
Andy Lester [Sat, 14 Jan 2023 04:54:12 +0000 (22:54 -0600)]
Correct sort docs about value of the sort func.
Paul "LeoNerd" Evans [Fri, 13 Jan 2023 19:57:22 +0000 (19:57 +0000)]
get_{av,hv} should ignore SVf_UTF8 when testing if flags == 0
Richard Leach [Fri, 13 Jan 2023 18:04:38 +0000 (18:04 +0000)]
t/test.pl - assign caller() to separate vars in _where()
_where() is a heavily used, small subroutine within test.pl. It currently
assigns all eleven of caller()'s return values to array elements, then
uses only the second and third of these to construct a return string.
This commit changes the assignment to use separate variables, which ends
up being more efficient.
Benchmarking suggests that _where() is about 30% faster as a result.
The overhead of ok( ... ) calls is reduced by about 10%.
Yves Orton [Fri, 13 Jan 2023 17:23:01 +0000 (18:23 +0100)]
regexec.c - avoid calling regrepeat when the max is 0
When we have a max quantifier of 0, then the quantified
item is essentially a NOTHING reop. Regardless, we do not
need to call regrepeat, and doing so confuses some of the
logic it contains. Simply avoiding calling regrepeat()
fixes the underlying issue, and avoids the broken code.
This fixes GH Issue #20690.
Ferenc Erki [Fri, 13 Jan 2023 14:24:35 +0000 (15:24 +0100)]
Add myself as author
Ferenc Erki [Fri, 13 Jan 2023 14:10:08 +0000 (15:10 +0100)]
Fix typos
Karl Williamson [Fri, 13 Jan 2023 01:45:44 +0000 (18:45 -0700)]
regcomp_internal.h: Fix leak in regex tests
Commit fe5492d916201ce31a107839a36bcb1435fe7bf0 introduced leaks when a
regex compilation fails. This commit uses the standard method we have
to deal with these kinds of things.
Karl Williamson [Sat, 31 Dec 2022 20:08:42 +0000 (13:08 -0700)]
POSIX.pod: Clarify mbtowc(), wctomb() pod
Karl Williamson [Tue, 13 Dec 2022 02:51:40 +0000 (19:51 -0700)]
Fix PerlEnv_putenv threaded compilation on Windows
A second compilation of a workspace would fail. The first one would
succeed because miniperl was being used, which isn't threaded.
Karl Williamson [Sat, 31 Dec 2022 20:03:04 +0000 (13:03 -0700)]
Don't panic if can't destroy mutex during global destruction
It's going to be destroyed anyway; this just obscures what the real
failure might be.
Elvin Aslanov [Thu, 12 Jan 2023 09:53:47 +0000 (13:53 +0400)]
Syntax highlight configpm
For enhancing the source readability in GitHub:
cf. https://raw.githubusercontent.com/Perl/perl5/blead/PACKAGING
Yves Orton [Thu, 29 Dec 2022 11:07:22 +0000 (12:07 +0100)]
regcomp.c etc - rework branch reset so it works properly
Branch reset was hacked in without much thought about how it might interact
with other features. Over time we added named capture and recursive patterns
with GOSUB, but I guess because branch reset is somewhat esoteric we didn't
notice the accumulating issues related to it.
The main problem was my original hack used a fairly simple device to give
multiple OPEN/CLOSE opcodes the same target buffer id. When it was introduced
this was fine. When GOSUB was added later, however, we overlooked that this
broke a key part of the book-keeping for GOSUB.
A GOSUB regop needs to know where to jump to, and which close paren to stop
at. However the structure of the regexp program can change from the time the
regop is created. This means we keep track of every OPEN/CLOSE regop we
encounter during parsing, and when something is inserted into the middle of
the program we make sure to move the offsets we store for the OPEN/CLOSE data.
This is essentially keyed and scaled to the number of parens we have seen.
When branch reset is used however the number of OPEN/CLOSE regops is more than
the number of logical buffers we have seen, and we only move one of the
OPEN/CLOSE buffers that is in the branch reset. Which of course breaks things.
Another issue with branch reset is that it creates weird artifacts like this:
/(?|(?<a>a)|(?<b>b))(?&a)(?&b)/ where the (?&b) actually maps to the (?<a>a)
capture buffer because they both have the same id. Another case is that you
cannot check if $+{b} matched and $+{a} did not, because conceptually they
were the same buffer under the hood.
These bugs are now fixed. The "aliasing" of capture buffers to each other is
now done virtually, and under the hood each capture buffer is distinct. We
introduce the concept of a "logical parno" which is the user visible capture
buffer id, and keep it distinct from the true capture buffer id. Most of the
internal logic uses the "true parno" for its business, so a bunch of problems
go away, and we keep maps from logical to physical parnos, and vice versa,
along with a map that gives us the "next physical parno with the same
logical parno". Thus we can quickly skip through the physical capture buffers
to find the one that matched. This means we also have to introduce a
logical_total_parens as well, to complement the already existing total_parens.
The latter refers to the true number of capture buffers. The former represents
the logical number visible to the user.
It is helpful to consider the following table:
Logical:   $1  $2  $3  $2  $3  $4  $2  $5
Physical:   1   2   3   4   5   6   7   8
Next:       0   4   5   7   0   0   0   0
Pattern:   /(pre)(?|(?<a>a)(?<b>b)|(?<c>c)(?<d>d)(?<e>e)|(?<f>))(post)/
The names are mapped to physical buffers. So $+{b} will show what is in
physical buffer 3. But $3 will show whichever of buffer 3 or 5 matched.
Similarly @{^CAPTURE} will contain 5 elements, not 8. But %+ will contain all
6 named buffers.
Since the need to map these values is rare, we only store these maps when they
are needed and branch reset has been used, when they are NULL it is assumed
that physical and logical buffers are identical.
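To make the table concrete, here is a minimal C sketch of walking the "Next" chain to find which physical buffer matched for a given logical one. The array and function names are illustrative, not the ones used in the regex engine; the values are copied from the table.

```c
#include <assert.h>
#include <stddef.h>

/* Tables for /(pre)(?|(?<a>a)(?<b>b)|(?<c>c)(?<d>d)(?<e>e)|(?<f>))(post)/.
 * Index 0 is unused in both arrays. */
static const int first_physical[] = { 0, 1, 2, 3, 6, 8 };          /* by logical  */
static const int next_physical[]  = { 0, 0, 4, 5, 7, 0, 0, 0, 0 }; /* by physical */

/* Return the physical buffer that matched for a logical buffer, or 0
 * if none did. matched[p] is non-zero when physical buffer p matched. */
int physical_for_logical(int logical, const int *matched)
{
    int p;

    assert(logical >= 1 && logical <= 5);
    assert(matched != NULL);
    for (p = first_physical[logical]; p != 0; p = next_physical[p]) {
        if (matched[p])
            return p;
    }
    return 0;
}
```

If the second branch of the alternation matched, physical buffers 4, 5 and 6 are set, so logical $2 resolves to physical 4 and logical $3 to physical 5.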
Currently the way this change is implemented will likely break plug in regexp
engines because they will be missing the new logical_total_parens field at
the very least. Given that the perl internals code is somewhat poorly
abstracted from the regexp engine, with parts of the abstraction leaking out,
I think this is acceptable. If we want to make plug in regexp engines work
properly IMO we need to add some more hooks that they need to implement than
we currently do. For instance mg.c does more work than it should. Given there
are only a few plug in regexp engines and that it is specialized work, I
think this is acceptable. We can work with the authors to refine the API
properly later.
Yves Orton [Thu, 29 Dec 2022 15:56:30 +0000 (16:56 +0100)]
handy.h - add NewCopy() macro to combine New and Copy.
Yves Orton [Thu, 29 Dec 2022 15:21:51 +0000 (16:21 +0100)]
test.pl - add support for rtrimming fresh perl output
This makes it easier to do regexp debug tests, where we don't care
about trailing whitespace.
It also fixes the line number reporting for fresh_perl_is() and
fresh_perl_like() so that it shows the actual place where the line
number is located, and it changes the relevant code to work properly
with external $Level overrides.
Elvin Aslanov [Wed, 11 Jan 2023 17:01:57 +0000 (21:01 +0400)]
Replace FreeBSD URLs with new HTTPS ones
James E Keenan [Tue, 10 Jan 2023 18:48:13 +0000 (18:48 +0000)]
Hint should advise using 'make regen'
Per discussion by @demerphq in
https://github.com/Perl/perl5/pull/20682#issuecomment-1377536039. The
'regen' programs should be run with your installed 'perl'.
Use single quote in heredoc, as $_ is no longer being interpolated (per
@JRaspass in
https://github.com/Perl/perl5/pull/20683#discussion_r1066294815).
James E Keenan [Tue, 10 Jan 2023 18:44:03 +0000 (18:44 +0000)]
Correct one character typo appearing in lib/feature.pm
Since lib/feature.pm is a generated file, the actual changes are made in
regen/feature.pl, followed by 'make regen' to regenerate lib/feature.pm
(and then followed by 'make test_porting') to confirm.
Yves Orton [Mon, 9 Jan 2023 19:49:12 +0000 (20:49 +0100)]
regexec engine - wrap and replace RX_OFFS() with better abstractions
RX_OFFS() exposes a bit too much about how capture buffers are represented.
This adds RX_OFFS_START() and RX_OFFS_END() and RX_OFFS_VALID() to replace
most of the uses of the RX_OFFS() macro or direct access to the rx->offs[]
array. (We add RX_OFFSp() for those rare cases that should have direct
access to the array.) This allows us to replace this logic with more
complicated macros in the future. Pretty much anything using RX_OFFS() is
going to be broken by future changes, so changing the define allows us to
track it down easily.
Not all uses of the rx->offs[] array are converted; some uses are required
for the regex engine internals, but anything outside of the regex engine
should be using the replacement macros, and most things in the regex internals
should use it also.
James E Keenan [Tue, 10 Jan 2023 02:54:02 +0000 (02:54 +0000)]
Math::Trig: POD correction
For: https://rt.cpan.org/Ticket/Display.html?id=114105
Tony Cook [Mon, 9 Jan 2023 23:32:05 +0000 (10:32 +1100)]
Tony Cook [Wed, 23 Nov 2022 05:19:40 +0000 (16:19 +1100)]
perlport: note the new behavior of symlink with / paths
Tony Cook [Wed, 23 Nov 2022 05:19:08 +0000 (16:19 +1100)]
win32_symlink(): we replace any /, we no longer need to check for them
Tony Cook [Wed, 23 Nov 2022 04:08:38 +0000 (15:08 +1100)]
win32_symlink: correctly handle linking to abs path to a directory
Fixes #20533
Tony Cook [Mon, 21 Nov 2022 23:41:19 +0000 (10:41 +1100)]
File::Find: handle \ in symbolic links on Win32
Tony Cook [Thu, 17 Nov 2022 05:58:10 +0000 (16:58 +1100)]
on win32 translate / to \ in symlink targets
Windows, or at least NTFS, doesn't appear to follow symlinks
where the target contains the POSIX directory separator "/".
To fix that translate any / to \ in symlink targets. This may
break code that checks the symlink target matches a value set,
but I think it's more likely to fix code that blindly uses /
than break code that looks at the symlink target they just set.
Fixes #20506
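The translation step itself is simple; a self-contained sketch might look like this. The function name is hypothetical and this is not the win32_symlink() code.

```c
#include <assert.h>
#include <stddef.h>

/* Replace every '/' in a NUL-terminated path with '\\', in place.
 * Returns the number of characters replaced. */
size_t slashes_to_backslashes(char *path)
{
    size_t replaced = 0;

    assert(path != NULL);
    for (; *path != '\0'; path++) {
        if (*path == '/') {
            *path = '\\';
            replaced++;
        }
    }
    return replaced;
}
```

Returning the count lets a caller tell whether the target was changed, which is the case that could surprise code comparing the stored target with the value it passed in.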
Tony Cook [Mon, 9 Jan 2023 23:05:17 +0000 (10:05 +1100)]
perldelta for 8c61f504eeb9
Tony Cook [Thu, 10 Nov 2022 22:55:59 +0000 (09:55 +1100)]
make win32_lstat() return the length of the link in st_size
This is reflected in the result of lstat() in perl.
This matches POSIX behaviour.
Fixed #20476
Yves Orton [Sun, 8 Jan 2023 13:16:36 +0000 (14:16 +0100)]
regcomp.pl - fixup intflags debug data to handle gaps properly
We were not handling gaps in the sequence properly, and effectively
showing the wrong flag names or missing the last flag. Now we die if there
are any collisions or if any of the PREGf defines set more than one bit.
This also adds some crude tests to validate that intflags serialization is
working properly.
Note, extflags handles more complex scenarios and seems to handle this
gracefully already, hence the reason I haven't touched it as well.
This also tweaks a comment in lexical_debug.t which part of this was
cribbed from.
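regcomp.pl is Perl, but the two sanity rules described above (each define sets exactly one bit, and no two defines collide) can be sketched in C as follows. Function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* True when exactly one bit is set in value. */
int is_single_bit(uint32_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}

/* Check a list of flag values: each must be a single bit and no two
 * may share a bit. Gaps between bits are allowed.
 * Returns 1 if the list is valid, 0 otherwise. */
int flags_are_valid(const uint32_t *flags, size_t count)
{
    uint32_t seen = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (!is_single_bit(flags[i]) || (seen & flags[i]) != 0)
            return 0;
        seen |= flags[i];
    }
    return 1;
}
```

Tracking the bits seen so far in a mask means gaps in the sequence need no special handling.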
Yves Orton [Mon, 9 Jan 2023 16:56:46 +0000 (17:56 +0100)]
Makefile.SH - fix 'reonly' Makefile target to test ext/re/t/*.t properly
The .. in front of the ext/ is required as the list is constructed
relative to the t/ directory of the repo.
This also enables "full steam ahead" mode when parallel jobs are
enabled. This target only tests a subset of our functionality, running
in normal mode and separating core tests from ext/ tests just slows
things down for no value.
Yves Orton [Wed, 4 Jan 2023 21:05:58 +0000 (22:05 +0100)]
av.c - av_store() do the refcount dance around magic av's
The API for av_store() says it is the caller's responsibility to call
SvREFCNT_inc() on the stored item after the store is successful.
However inside of av_store() we store the item in the C level array before we
trigger magic. To a certain extent this is required because mg_set(av) needs
to be able to see the newly stored item.
But if the mg_set() or other magic associated with the av_store() operation
fails, we would end up with a double free situation, as we will long jump up
the stack above and out of the call to av_store(), freeing the mortal as we go
(via Perl_croak()), but leaving the reference to the now freed pointer in the
array. When the next SV is allocated the reference will be reused, and then we
are in a double free scenario.
I see comments in pp_aassign talking about defusing the temps stack for the
parameters it is passing in, and things like this, which at first looked
related. But that commentary doesn't seem that relevant to me, as this bug
could happen any time a scalar owned by one data structure was copied into an
array with set magic which could die. Eg, I can easily imagine XS code that
expected code like this (assume it handles magic properly) to work:
    SV **svp = av_fetch(av1,0,1);
    if (av_store(av2,0,*svp))
        SvREFCNT_inc(*svp);
but if av2 has set magic and it dies the end result will be that both av1 and
av2 contain a visible reference to *svp, but its refcount will be 1. So I think
this is a bug regardless of what pp_aassign does.
This fixes https://github.com/Perl/perl5/issues/20675
Yves Orton [Fri, 30 Dec 2022 16:15:51 +0000 (17:15 +0100)]
update_authors.t - fixup shallow clone guards
We were not testing for errors from rev-parse, and we were improperly
passing in a range; the --verify mode does not expect a range,
it expects a specific commit.
This changes the test to check each end of the commit range, and to
check for errors. It also uses the {commit} notation to check if the
objects actually are commits.
Max Maischein [Sun, 8 Jan 2023 09:45:46 +0000 (10:45 +0100)]
Fix name within (restored) pod/perl5370delta
Karen Etheridge [Sat, 7 Jan 2023 23:57:01 +0000 (15:57 -0800)]
restore pod/perl5370delta.pod
Somehow, in commit 60fa4bd4b1ac the perldelta template was copied into
perl5370delta.pod instead of perldelta.pod. This commit restores the
content just before that commit, which reflects what was in the actual 5.37.0
release.
Bram [Sat, 7 Jan 2023 19:06:21 +0000 (20:06 +0100)]
CI linux-i386: Replace checkout@v1 with git cmds
For linux-i386 we are/were using checkout@v1 because checkout@v2/checkout@v3
do not work inside a i386-container.
checkout@v1 however has some issues[^1] with shallow clones (`fetch-depth: 1`):
- it causes race conditions[^2]
- it makes it impossible to re-run an older job
Options we have today:
- clone with full history: this is slow *and* influences some of the (porting)
tests.
- ditch checkout@v1 and manually clone it: we already do this for cygwin
and it's faster: 58s (for checkout@v1) vs 7s when doing it ourselves[^3].
-> Ditch checkout@v1.
[^1]: Upstream report: https://github.com/actions/runner/issues/2357
[^2]: `git clone` fails when new commits are pushed between the start of
the CI run and the start of the linux-i386 job. With our CI config
that is a time-frame of about 10 minutes (since that is how long
the 'Sanity check' takes).
[^3]: checkout@v1 always fetches *all* tags and *all* branches.
Yves Orton [Fri, 6 Jan 2023 10:05:00 +0000 (11:05 +0100)]
dist/threads - bump version
The way CI works it is easy to merge a patch that will lead to
a cmp_version test fail. If two people modify the same file, bump
it to the same new version, then there will be no conflict nor test
fail, until it is merged at which point it will fail because the
code has changed but the version hasn't.
This is a quickie fixup when I got hit by this today with threads.
Yves Orton [Thu, 29 Dec 2022 17:40:17 +0000 (18:40 +0100)]
regcomp.c - make sure CURLYM closes the capture buffer after each match
We have to close the paren immediately after each time we
match A, or conditional patterns will break.
An alternative and possibly more efficient solution would be to block
CURLYX -> CURLYM conversion when the inside contains a conditional check
just like we do with EVAL.
Fixes #7754 aka https://github.com/Perl/perl5/issues/7754
Yves Orton [Fri, 30 Dec 2022 13:04:06 +0000 (14:04 +0100)]
regcomp.c - in regdupe_internal() set up "in program" stclass properly
We were only handling "synthetic start classes", not ones that are in
the program itself, because those don't have an entry in the data array.
So after copying the program, and after ruling out that the regstclass is
synthetic, we can assume that if it is non-null it points into the program
itself, and simply set up the copy accordingly.
Fixes https://github.com/Perl/perl5/issues/9373