This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
David Mitchell [Wed, 26 Dec 2018 12:58:06 +0000 (12:58 +0000)]
foo_cloexec() under PERL_GLOBAL_STRUCT_PRIVATE
Fix the various Perl_PerlSock_dup2_cloexec() type functions so that
t/porting/libperl.t passes under -DPERL_GLOBAL_STRUCT_PRIVATE builds.
In these builds it is forbidden to have any static variables, but each
of these functions (via convoluted macros) has a static var called
'strategy' which records, for each function, whether a run-time probe
has been done to determine the best way of achieving close-exec
functionality, and the result.
Replace them all with 'global' vars: PL_strategy_dup2 etc.
NB these vars aren't thread-safe but it doesn't really matter, as the
worst that can happen is for a redundant probe or two to be done before
a suitable "don't probe any more" value is written to the var and seen
by all the threads.
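As a rough illustration of the pattern described above, the following sketch caches the result of a run-time close-on-exec probe in a file-scope variable rather than a function-local static. The names strategy_dup, dup_cloexec and the enum are hypothetical and are not the perl source; perl reaches its PL_ variables through its own global-variable machinery, whereas this sketch uses a plain global for brevity. Like the variables in the commit, it is not thread-safe; the worst outcome is a redundant probe.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical probe states; 0 means "not probed yet". */
enum cloexec_strategy {
    STRATEGY_UNPROBED = 0,  /* no probe has been run yet */
    STRATEGY_ATOMIC,        /* fcntl(F_DUPFD_CLOEXEC) is known to work */
    STRATEGY_FALLBACK       /* must dup() and then set FD_CLOEXEC */
};

/* File-scope variable standing in for something like PL_strategy_dup. */
int strategy_dup = STRATEGY_UNPROBED;

/* Duplicate oldfd with close-on-exec set; returns -1 on failure. */
int dup_cloexec(int oldfd)
{
    int fd;

    assert(oldfd >= 0);
#ifdef F_DUPFD_CLOEXEC
    if (strategy_dup != STRATEGY_FALLBACK) {
        fd = fcntl(oldfd, F_DUPFD_CLOEXEC, 0);
        if (fd >= 0) {
            strategy_dup = STRATEGY_ATOMIC;   /* probe succeeded */
            return fd;
        }
        /* A failure other than "unsupported" is a genuine error. */
        if (strategy_dup == STRATEGY_ATOMIC || errno != EINVAL)
            return -1;
        strategy_dup = STRATEGY_FALLBACK;     /* don't probe any more */
    }
#else
    strategy_dup = STRATEGY_FALLBACK;
#endif
    /* Non-atomic fallback: duplicate, then set the flag separately. */
    fd = dup(oldfd);
    if (fd >= 0 && fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After the first successful call, strategy_dup no longer holds STRATEGY_UNPROBED, so later calls skip the decision about which path to try.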
David Mitchell [Fri, 28 Dec 2018 11:29:27 +0000 (11:29 +0000)]
PERL_GLOBAL_STRUCT_PRIVATE: fix some const strings
change a couple of
const char * foo[] = { ... }
to
const char * const foo[] = { ... }
Making the string ptrs const means the whole thing is RO and doesn't
appear in data section, making porting/libperl.t happier when building
under -DPERL_GLOBAL_STRUCT_PRIVATE.
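The difference between the two declarations can be seen in a small standalone sketch. The array names and contents here are made up for illustration; only the two declaration forms come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The characters are const but the pointers are not, so the array
 * itself must be writable and typically lands in a data section. */
const char *names_rw[] = { "alpha", "beta" };

/* Both the characters and the pointers are const, so a compiler is
 * free to place the whole table in read-only storage. */
const char * const names_ro[] = { "alpha", "beta" };

/* Return the index of s in names_ro, or -1 if it is not present. */
int find_name(const char *s)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < sizeof names_ro / sizeof names_ro[0]; i++) {
        if (strcmp(names_ro[i], s) == 0)
            return (int)i;
    }
    return -1;
}
```

Assigning to names_rw[0] compiles, while the same assignment to names_ro[0] is rejected by the compiler, which is what allows the second form to be fully read-only.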
David Mitchell [Wed, 26 Dec 2018 20:50:16 +0000 (20:50 +0000)]
regcomp.c: don't include INTERN.h
This file only needs including by globals.c; it was being included
in regcomp.c too as the declarations in regcomp.h aren't included by
perl.h and thus don't get pulled into globals.c. This was a confusing
and hacky workaround.
Instead, this commit causes globals.c to #include regcomp.h directly.
After this commit, only globals.c #includes INTERN.h.
David Mitchell [Wed, 26 Dec 2018 20:37:45 +0000 (20:37 +0000)]
ext/SDBM_File/sdbm.c: don't include INTERN.h
This file really only needs including by globals.c - including it in
sdbm.c was probably just a thinko or cut and paste error from decades ago.
Removing it doesn't seem to break anything.
After this commit, only globals.c and regcomp.c include it.
David Mitchell [Wed, 26 Dec 2018 10:41:28 +0000 (10:41 +0000)]
vutil.c: build under PERL_GLOBAL_STRUCT_PRIVATE
The perl build option -DPERL_GLOBAL_STRUCT_PRIVATE had bit-rotted
due to lack of smoking. This commit and the next fix it.
I've separated out the vutil.c change into a separate commit since
this file is actually part of the 'version' CPAN distribution and
normally should be edited upstream first.
David Mitchell [Wed, 26 Dec 2018 10:45:22 +0000 (10:45 +0000)]
add dVAR's for PERL_GLOBAL_STRUCT_PRIVATE builds
The perl build option -DPERL_GLOBAL_STRUCT_PRIVATE had bit-rotted
due to lack of smoking. The main fix is to just add 'dVAR;' to any
functions which have a pTHX arg. It's a NOOP on normal builds.
David Mitchell [Tue, 19 Feb 2019 09:12:33 +0000 (09:12 +0000)]
re/user_prop_race_thr.t: reduce timeout
This new test script has a test that's supposed to exercise an up-to-10s
wait-and-retry loop when loading properties. It has a 500s timeout
built in, in case that fails. On my system it's been intermittently
failing (not sure if due to something I'm doing or a problem with the
test or with regcomp.c), which effectively hangs the test run.
So decrease the timeout to 25 secs.
Nicolas R [Mon, 18 Feb 2019 23:42:44 +0000 (16:42 -0700)]
Update Time-HiRes Changes for 1.9760
1.9760 is now released to CPAN to match its status
in blead.
This commit is synchronizing the Changelog, by reintroducing
some history which was lost during previous reverts.
Any new change since cf8375d should now go to the next release 1.9761.
A '{{NEXT}}' entry was added to the Changes for tracking these changes.
Note that a Dual-Life git repository is now available for Time-HiRes.
Upstream-URL: https://github.com/Dual-Life/Time-HiRes
Sawyer X [Mon, 18 Feb 2019 07:36:46 +0000 (09:36 +0200)]
Update release managers
Aristotle Pagaltzis [Mon, 18 Feb 2019 07:06:30 +0000 (08:06 +0100)]
deprecate: bump $VERSION to 0.04
Aristotle Pagaltzis [Sun, 17 Feb 2019 23:14:52 +0000 (00:14 +0100)]
deprecate: expand the documentation
Aristotle Pagaltzis [Sun, 17 Feb 2019 23:14:46 +0000 (00:14 +0100)]
prepare next patch
Aristotle Pagaltzis [Sun, 17 Feb 2019 23:14:38 +0000 (00:14 +0100)]
deprecate: fix POD heading level
Karl Williamson [Sun, 17 Feb 2019 03:02:57 +0000 (20:02 -0700)]
mktables: Omit unnecessary duplicates
These are in a generated structure.
Karl Williamson [Sat, 16 Feb 2019 19:14:27 +0000 (12:14 -0700)]
perldelta: perldelta for previous commit
Karl Williamson [Sat, 16 Feb 2019 18:44:56 +0000 (11:44 -0700)]
malloc.c: Limit malloc size to PTRDIFF_MAX
Without doing this, it is possible that the behavior is undefined when
subtracting two pointers that point to the same object.
See thread beginning at
http://nntp.perl.org/group/perl.perl5.porters/251541
In particular this from Tomasz Konojacki
C11 says:
> When two pointers are subtracted, both shall point to elements of the
> same array object, or one past the last element of the array object;
> the result is the difference of the subscripts of the two array
> elements. The size of the result is implementation-defined, and its
> type (a signed integer type) is ptrdiff_t defined in the <stddef.h>
> header. If the result is not representable in an object of that type,
> the behavior is undefined.
There are many ways to interpret this passage, but according to (most?)
C compiler developers, it means that no object can be larger than
PTRDIFF_MAX. For example, gcc's optimizer assumes that strlen() will
never return anything larger than PTRDIFF_MAX [1].
There's also a blogpost[2] on this topic, which IMO is a very
interesting read.
If gcc and clang can assume that all objects won't be larger than
PTRDIFF_MAX, so can we. Also, in practice, ssize_t and ptrdiff_t on most
(all?) platforms are defined as exactly the same type.
BTW, the fact that compilers assume that objects can't be larger than
PTRDIFF_MAX has very dangerous implications on 32-bit platforms. Is it
possible to create a string longer than PTRDIFF_MAX on 32-bit perls? It
shouldn't be allowed.
[1] - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78153
[2] - https://trust-in-soft.com/objects-larger-than-ptrdiff_max-bytes/
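A minimal sketch of the idea, assuming a simple wrapper around the system malloc() rather than perl's own malloc.c: requests larger than PTRDIFF_MAX are refused, so that subtracting two pointers into any successfully allocated block is always representable as a ptrdiff_t. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate nbytes, refusing any request that could create an object
 * larger than PTRDIFF_MAX bytes. Returns NULL on refusal or failure. */
void *checked_malloc(size_t nbytes)
{
    if (nbytes > (size_t)PTRDIFF_MAX)
        return NULL;
    return malloc(nbytes);
}

/* With the limit above, this difference cannot overflow ptrdiff_t for
 * two pointers into the same block from checked_malloc(). */
ptrdiff_t buffer_span(const char *start, const char *end)
{
    assert(start != NULL && end != NULL && end >= start);
    return end - start;
}
```

A caller would see checked_malloc(SIZE_MAX) fail immediately, without the request ever reaching the underlying allocator.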
Karl Williamson [Sat, 16 Feb 2019 18:29:51 +0000 (11:29 -0700)]
regcomp.c: Don't iterate a loop needlessly
While single stepping in gdb, I noticed that this loop kept executing,
when it need not.
Karl Williamson [Sat, 16 Feb 2019 18:29:01 +0000 (11:29 -0700)]
perldelta for previous commit
Karl Williamson [Sat, 16 Feb 2019 18:11:59 +0000 (11:11 -0700)]
PATCH: [perl #133770] null pointer dereference in S_regclass()
The failing case can be reduced to
qr/\x{100}[\x{3030}\x{1fb2}/
(It only happens on UTF-8 patterns).
The bottom line is that it was assuming that there was at least one
character that folded to 1fb2 besides itself, even though the function
call said there weren't any such. The solution is to pay attention to
the function return value.
I incorporated Hugo's++ patch as part of this one.
However, the original test case should never have gotten this far. The
parser is getting passed garbage, and instead of croaking, it is somehow
interpreting it as valid and calling the regex compiler. I will file a
ticket about that.
Karl Williamson [Sat, 16 Feb 2019 16:50:33 +0000 (09:50 -0700)]
PATCH: [perl #133767] Assertion failure
The problem here is that a syntax error occurs and hence certain things
don't get done, but processing continues, as the error isn't checked for
until after the return of the function that found it. The failing
assertion is checking that those certain things actually did get done.
There appear to be good reasons to defer the raising of the error until
then, so the simplest way to fix this is to generalize the code so that
the failing assertion doesn't happen.
James E Keenan [Fri, 15 Feb 2019 12:47:47 +0000 (07:47 -0500)]
Jakub Wilk is now a Perl author.
Jakub Wilk [Thu, 14 Feb 2019 17:11:56 +0000 (18:11 +0100)]
perlthrtut: Fix POD formatting
Dagfinn Ilmari Mannsåker [Fri, 15 Feb 2019 11:15:03 +0000 (11:15 +0000)]
Use STATIC_ASSERT_STMT for checking compile-time invariants
Better to have the build fail if they're wrong than relying on the
code path being hit at runtime in a DEBUGGING build.
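The contrast between a compile-time and a run-time check can be sketched with C11's static_assert, used here as a stand-in for perl's STATIC_ASSERT_STMT macro. The constant FLAG_BITS and the function are hypothetical and exist only to give the assertions something to check.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical invariant: the flag field must fit in an unsigned. */
enum { FLAG_BITS = 5 };

/* Move a small flag value into the top FLAG_BITS bits of an unsigned. */
unsigned pack_flags(unsigned flags)
{
    /* Checked by the compiler: a wrong constant breaks the build,
     * whether or not this function is ever called. */
    static_assert(FLAG_BITS < sizeof(unsigned) * CHAR_BIT,
                  "flag field must fit in an unsigned");

    /* Checked only when this line runs, and only if assertions are
     * enabled in the build. */
    assert(flags < (1u << FLAG_BITS));

    return flags << (sizeof(unsigned) * CHAR_BIT - FLAG_BITS);
}
```

If FLAG_BITS were changed to a value that does not fit, the build would fail at the static_assert instead of waiting for the code path to be exercised.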
Karl Williamson [Fri, 15 Feb 2019 05:14:12 +0000 (22:14 -0700)]
Merge branch 'incore' into blead
This branch moves the handling of user-defined \p{} properties from
lib/utf8_heavy.pl into regcomp.c (rewriting it in C). This fixes a
bunch of bugs, and removes all uses of swashes from regular expression
compilation and execution.
Karl Williamson [Wed, 22 Aug 2018 04:27:19 +0000 (22:27 -0600)]
Remove relics of regex swash use
This removes the most obvious and easy things that are no longer needed
since regexes no longer use swashes at all.
tr/// continues, for the time being, to use swashes, so not all swash
handling is removable now. But tr/// doesn't use inversion lists, and
so a bunch of code is ripped out here. Other code could have been, but
I did only the relatively easy stuff. The rest can be ripped out all at
once when tr/// stops using swashes.
Karl Williamson [Thu, 14 Feb 2019 19:34:49 +0000 (12:34 -0700)]
Use mnemonics for array indices
The element at say, [0] is a particular thing. This commit changes to
use a mnemonic instead of [0], for clarity.
Karl Williamson [Thu, 23 Aug 2018 19:54:48 +0000 (13:54 -0600)]
regcomp.c: Arrays no longer need PL_sv_undef placeholders
An empty entry is now just NULL.
Karl Williamson [Wed, 22 Aug 2018 02:12:00 +0000 (20:12 -0600)]
regcomp.c: Simplify args passing for ANYOF nodes
A swash is no longer used, so we can remove some elements from the array
of data that gets stored with the compiled pattern for use in runtime
matching. This is the first step in more simplifications.
Since a swash isn't used, this change also requires regexec.c to change
to use a straight inversion list lookup. This has the salutary effect
of eliminating a conversion between code point and UTF-8.
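For readers unfamiliar with inversion lists, here is a minimal sketch of a straight lookup. It assumes the usual representation: a sorted array of code points in which element 0 starts the first range that is in the set and every later element toggles membership. The function name and layout are illustrative and are not the regexec.c implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if cp is in the set described by the inversion list.
 * list[0] begins the first included range; each subsequent element
 * begins a range with the opposite membership. */
bool invlist_contains(const uint32_t *list, size_t len, uint32_t cp)
{
    size_t lo = 0, hi = len;

    assert(list != NULL || len == 0);

    /* Binary search for the number of elements that are <= cp. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] <= cp)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* An odd count means cp falls inside an included range. */
    return (lo & 1) != 0;
}
```

For example, the list { 0x41, 0x5B, 0x61, 0x7B } describes A-Z and a-z. The lookup works directly on code points, so no conversion to UTF-8 is needed before testing membership.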
Karl Williamson [Thu, 14 Feb 2019 19:16:13 +0000 (12:16 -0700)]
Add .t for testing user-defined \p{} races
Karl Williamson [Mon, 6 Aug 2018 23:00:40 +0000 (17:00 -0600)]
t/re/regexp_unicode_prop.t: Make sure sub called only once
User-defined properties are supposed to be called just once for /i and
once for non-/i. This adds tests for that.
It turns out that this was broken in blead.
Karl Williamson [Fri, 24 Aug 2018 18:34:18 +0000 (12:34 -0600)]
t/re/regexp_unicode_prop.t: Add tests
Add some tests. These test various error conditions that haven't been
tested before.
Karl Williamson [Wed, 15 Aug 2018 23:11:15 +0000 (17:11 -0600)]
t/re/regexp_unicode_prop.t: Test that can have nested pkgs
That is, in \p{user-defined}
Karl Williamson [Wed, 15 Aug 2018 23:09:45 +0000 (17:09 -0600)]
t/re/regexp_unicode_prop.t: Add some stress
This adds some trailing spaces and comments in expansion of
\p{user-defined}/ to verify things work.
Karl Williamson [Wed, 15 Aug 2018 23:07:51 +0000 (17:07 -0600)]
t/op/taint.t: Add test
Karl Williamson [Thu, 23 Aug 2018 20:05:29 +0000 (14:05 -0600)]
regcomp.c: Add some potential code that's #ifdef'd out
This is in case we ever need it. This checks for portability in the
code points specified in user-defined properties. Previously there was
a check, but I couldn't get a warning to trigger unless there was also
overflow. So that means the pattern compile failed due to the overflow,
and the portability warning was superfluous. But, one can have
non-portable code points without overflow; just the old method didn't
properly detect them. If we do ever need to detect and report on them,
the code is mostly written and in this commit.
Karl Williamson [Tue, 21 Aug 2018 00:31:04 +0000 (18:31 -0600)]
Move \p{user-defined} to core from utf8_heavy.pl
This large commit moves the handling of user-defined properties to C
code. This should speed it up, but the main reason to do this is to
stop using swashes in this case, leaving only tr/// using them. Once
that too is converted, all swash handling can be ripped out of perl.
Doing this in perl has caused some nasty interactions that will now be
fixed automatically.
The change is not entirely transparent, however (besides speed and the
possibility of removing these interactions). perldelta in this commit
details these.
Karl Williamson [Wed, 15 Aug 2018 22:11:04 +0000 (16:11 -0600)]
Add global hash to handle \p{user-defined}
A global hash has to be specially handled. The keys can't be shared,
and all the SVs stored into it must be in its thread. This commit adds
the hash, and initialization, and macros for context change, but doesn't
use them. The code to deal with this is entirely confined to regcomp.c.
Karl Williamson [Wed, 15 Aug 2018 21:45:14 +0000 (15:45 -0600)]
Add mutex for dealing with qr/\p{user-defined}/
This will be used in future commits
Karl Williamson [Mon, 6 Aug 2018 23:39:35 +0000 (17:39 -0600)]
regcomp.c: Add/reword some comments/white-space
Karl Williamson [Fri, 3 Aug 2018 20:12:49 +0000 (14:12 -0600)]
regcomp.c: Change variable name
The new name more closely corresponds with its use.
Nicolas R [Thu, 14 Feb 2019 21:44:24 +0000 (14:44 -0700)]
perldelta prep setup for v5.29.8
This is a preparation commit for the future
Perl v5.29.8 release.
perldelta still contains some placeholders which
need to be cleaned up before release. It also
needs to take into account any changes since
commit 5eabe055.
Several sections have already been removed on the assumption that
they would not be used; feel free to restore them if required.
Nicolas R [Thu, 14 Feb 2019 19:12:41 +0000 (13:12 -0600)]
perldelta module changes from ext,lib
remove useless module sections for 5.29.8
Nicolas R [Thu, 14 Feb 2019 18:45:20 +0000 (12:45 -0600)]
Update Modules section for perldelta for 5.29.8
List cpan & dist packages updated since v5.29.7
and add a warning for the JSON::PP incompatible changes.
Nicolas R [Thu, 14 Feb 2019 16:30:34 +0000 (10:30 -0600)]
Update JSON-PP to CPAN version 4.00
[DELTA]
4.00 2018-12-07
- production release
3.99_01 2018-12-03
- BACKWARD INCOMPATIBILITY:
As JSON::XS 4.0 changed its policy and enabled allow_nonref
by default, JSON::PP also enabled allow_nonref by default
- implement allow_tags that was introduced by JSON::XS 3.0
- add boolean_values that was introduced by JSON::XS 4.0
- allow literal tags in strings in relaxed mode, as JSON::XS 3.02 does
- allow PERL_JSON_PP_USE_B environmental variable to restore
old number detection behavior for compatibility
- various doc updates
Nicolas R [Thu, 14 Feb 2019 16:15:10 +0000 (10:15 -0600)]
t/porting/manifest.t add line number
Improve t/porting/manifest.t output on errors
to show the line number.
Nicolas R [Thu, 14 Feb 2019 02:22:10 +0000 (20:22 -0600)]
Net::Ping 501_ping_icmpv6.t: disable sudo test
This is similar to the changes made in 7bfdd8260c:
we do not want to use 'sudo' during the tests.
Nicolas R [Thu, 14 Feb 2019 00:32:46 +0000 (18:32 -0600)]
Update Net::Ping to upstream version 2.71
This retains blead customizations:
* 1a58b39af8 remove of 'use vars'
* 7bfdd8260c 500_ping_icmp.t: remove sudo code
These changes are not required anymore, they
are merged upstream
* 0fc44d0a18 avoid stderr noise in tests
Chris 'BinGOs' Williams [Thu, 14 Feb 2019 13:17:57 +0000 (13:17 +0000)]
Update Test-Simple to CPAN version 1.302162
[DELTA]
1.302162 2019-02-05 19:55:14-08:00 America/Los_Angeles
- Typo fixes in documentation
Nicolas R [Wed, 13 Feb 2019 23:58:31 +0000 (17:58 -0600)]
Update Module-Load to CPAN version 0.34
[DELTA]
0.34 Sun Feb 10 13:56:54 GMT 2019
* Added SEE ALSO section to documentation. RT#100575
* Unreachable code cleanup (https://github.com/jib/cpanplus-devel/pull/15)
Karl Williamson [Wed, 13 Feb 2019 17:02:13 +0000 (10:02 -0700)]
perlrecharclass: Note many fewer xdigits than digits
This adds a note explaining why there are only two sets of hex digits
Karl Williamson [Wed, 13 Feb 2019 16:33:56 +0000 (09:33 -0700)]
perlrecharclass: Rmv obsolete RFC
The deleted text asked for comments on a proposal that never went
anywhere.
Karl Williamson [Wed, 13 Feb 2019 16:30:29 +0000 (09:30 -0700)]
perlrecharclass: Clarify
See http://blogs.perl.org/users/tom_wyant/2019/01/untrusted-numeric-input.html
Tony Cook [Wed, 6 Feb 2019 04:42:10 +0000 (15:42 +1100)]
(perl #133660) add test for goto &sub in overload leaking
The bug in this case was fixed in db9848c8d.
Andreas Koenig [Sun, 10 Feb 2019 16:20:57 +0000 (16:20 +0000)]
perlsyn.pod: correct typo in doc
Karl Williamson [Wed, 6 Feb 2019 18:54:14 +0000 (11:54 -0700)]
makedef.pl: Fix to work with -DNO_LOCALE config opt
We shouldn't export non-existent variables
Karl Williamson [Wed, 6 Feb 2019 18:53:10 +0000 (11:53 -0700)]
locale.c: Fix compilation error
This code would fail to compile if Configure had ccflags=-DNO_LOCALE
Karl Williamson [Wed, 6 Feb 2019 18:51:05 +0000 (11:51 -0700)]
t/loc_tools.pl: C.UTF-8 is a likely locale
When looking for locales on a system, try this one, which seems to be
becoming widely available.
Karl Williamson [Wed, 6 Feb 2019 18:49:25 +0000 (11:49 -0700)]
ext/POSIX: Fix compilation error
This code is not usually compiled, but if tried, it would fail. It
needed a cast.
Karl Williamson [Tue, 5 Feb 2019 18:45:10 +0000 (11:45 -0700)]
Merge branch 'turkic locale handling' into blead
This series of commits adds seamless handling of UTF-8 Turkic locales
to blead. Unicode furnishes an alternate set of casing rules for these
locales, which until now were ignored by Perl. These commits cause
Perl to use the alternate rules when it detects that the UTF-8 locale it
is using is in fact a specialized Turkic one.
Karl Williamson [Tue, 5 Feb 2019 18:30:05 +0000 (11:30 -0700)]
Docs for new Turkic UTF-8 locale support
Karl Williamson [Tue, 5 Feb 2019 01:58:26 +0000 (18:58 -0700)]
locale.c: Add detection of Turkic UTF-8 locales
When switching into a new locale, after it is decided this is a UTF-8
locale, the code now also checks whether the locale is a specialized
Turkic one, which has a couple of slightly modified case-changing rules.
If so, it sets a flag indicating this.
The code added in previous commits in this series checks whether
that flag is set when it is actually paying attention to the
background locale, and if so behaves according to Unicode Turkic rules.
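The flag-driven behaviour can be sketched as follows. This is a deliberately tiny model: it handles only ASCII letters plus the Turkic dotted and dotless I pairs (U+0130 and U+0131), works on single code points, and uses a hypothetical flag name. It is not the perl implementation, which works on UTF-8 strings and applies the full Unicode casing tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for "the current UTF-8 locale is Turkic". */
bool in_turkic_locale = false;

/* Lowercase one code point; only ASCII and the Turkic I rules. */
uint32_t to_lower_cp(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (in_turkic_locale) {
        if (cp == 'I')
            return 0x0131;  /* I lowercases to dotless i */
        if (cp == 0x0130)
            return 'i';     /* dotted capital I lowercases to i */
    }
    if (cp >= 'A' && cp <= 'Z')
        return cp + ('a' - 'A');
    return cp;
}

/* Uppercase one code point; only ASCII and the Turkic I rules. */
uint32_t to_upper_cp(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (in_turkic_locale) {
        if (cp == 'i')
            return 0x0130;  /* i uppercases to dotted capital I */
        if (cp == 0x0131)
            return 'I';     /* dotless i uppercases to I */
    }
    if (cp >= 'a' && cp <= 'z')
        return cp - ('a' - 'A');
    return cp;
}
```

With the flag clear, 'I' lowercases to 'i' as usual; once the flag is set, the same call yields U+0131, and the round trip through to_upper_cp returns 'I'.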
Karl Williamson [Tue, 5 Feb 2019 01:30:25 +0000 (18:30 -0700)]
regcomp.c: White-space only
Indent a block of code newly formed by the previous commit
Karl Williamson [Tue, 5 Feb 2019 00:46:20 +0000 (17:46 -0700)]
Add Turkish locale handling to /i pattern matching
Previous commits in this series have changed uc(), lc(), fc(), etc. to
know how to handle Turkish UTF-8 locales. This commit extends this to
/i regular expression pattern matching.
Karl Williamson [Mon, 4 Feb 2019 23:18:51 +0000 (16:18 -0700)]
pp.c: White-space only
Indent block newly formed by the previous commit.
Karl Williamson [Mon, 4 Feb 2019 23:12:58 +0000 (16:12 -0700)]
pp.c: Add handling for Turkish locales for uc() etc
The functions lc() uc() ucfirst() lcfirst() and fc() are hereby expanded
to handle the differences required in Turkish locales.
No Turkish locales are recognized until later in this series of
commits.
Karl Williamson [Mon, 4 Feb 2019 23:03:49 +0000 (16:03 -0700)]
t/op/lc.t: Add tests for Turkish locales
But since these aren't recognized yet, they will be skipped
Karl Williamson [Mon, 4 Feb 2019 22:29:55 +0000 (15:29 -0700)]
Add .t to test Turkic locale folding
This just calls fold_grind.pl with a particular option.
But, as of this commit, Turkish locales aren't recognized specially, so
this test just always skips.
Karl Williamson [Mon, 4 Feb 2019 22:23:31 +0000 (15:23 -0700)]
t/re/fold_grind.pl: Enhance to deal with Turkic rules
The CaseFolding.txt file has special locale-dependent rules. This
commit changes fold_grind to notice them, and to generate tests for
the situation we aren't in, which are expected to fail.
Since, as of this commit, the Turkic locale is not recognized, this
commit has the effect of generating tests for the Turkic locale, running
them, and making sure they fail when appropriate.
Karl Williamson [Mon, 4 Feb 2019 21:18:18 +0000 (14:18 -0700)]
t/loc_tools.pl: Add functions to find Turkic UTF-8 locales
These will be used by later commits. But right now Perl doesn't know
how to determine if a locale is Turkic, so these functions return no
locale, until later in this commit series
Karl Williamson [Mon, 4 Feb 2019 21:07:11 +0000 (14:07 -0700)]
utf8.c: Add functions for Turkic locale case changing
These override the normal handling of UTF-8 locale case changing.
They aren't actually called yet, until later in this series of commits.
Karl Williamson [Mon, 4 Feb 2019 21:11:08 +0000 (14:11 -0700)]
Add variable for if the current UTF-8 locale is Turkic
It currently is always set false, until later in this series of commits.
Karl Williamson [Mon, 4 Feb 2019 16:18:24 +0000 (09:18 -0700)]
regcomp.c: Under /l any < 256 char can match any other
The code knew this, but it was adding the ASCII alphabetics to the list
of things that matched in UTF-8 locales. This is unnecessary, as we've
long had the infrastructure elsewhere to handle all potential mappings
from a Latin1 code point to other Latin1, so we can just rely on it.
And it created complexities for future commits in this series.
The MICRO SIGN is the exception, as it folds to non-Latin1 in UTF-8
locales, and this is the place where the structure exists to handle
that.
Karl Williamson [Sun, 3 Feb 2019 17:03:18 +0000 (10:03 -0700)]
regen/regcharclass.pl: Remove obsolete macro
This has been replaced by regen/unicode_constants.pl some releases ago.
Karl Williamson [Fri, 1 Feb 2019 18:22:15 +0000 (11:22 -0700)]
regen/mk_invlists.pl: Create new inversion list
This will be used in a future commit.
Karl Williamson [Mon, 21 Jan 2019 16:46:00 +0000 (09:46 -0700)]
mktables: Make Turkic 'I' chars problematic
In a Turkic locale, these are problematic because their mappings
cross the 255/256 boundary.
This change has the side effect of causing U+307 to be added to the
problematic list, and it normally really isn't problematic, because in
those locales where U+130 and U+131 are problematic, U+307 isn't used.
But applications could switch in and out of Turkic locales, so it's best
to leave it considered problematic. The consequences of making this
mark problematic are simply slightly less optimized regex pattern code.
Karl Williamson [Tue, 5 Feb 2019 05:09:31 +0000 (22:09 -0700)]
sv_utf8_upgrade_flags_grow(): Alloc extra byte if empty
People may call this expecting that the 'extra' parameter is on top of
whatever is in there. If something is in there, that already includes a
NUL, but if nothing is in there, for safety, add a byte to the request.
Karl Williamson [Tue, 5 Feb 2019 01:27:38 +0000 (18:27 -0700)]
regcomp.c: Clarify comment
Karl Williamson [Tue, 5 Feb 2019 05:13:16 +0000 (22:13 -0700)]
pp.c: Clarify comment
David Mitchell [Tue, 5 Feb 2019 14:04:32 +0000 (14:04 +0000)]
[MERGE] various overload fixups
This branch contains several commits which simplify the code concerning
the processing of a value returned by an overload method, and
specifically whether that value should be returned as-is by the op, or
assigned to the targ / stack value: ($lex = x op y) and (x op= y)
respectively.
The final commit fixes a bug in pp_multiconcat. That op bypasses most of
the code in those earlier commits and "rolls its own", and was
getting the set/assign decision wrong in some cases, causing a leak.
David Mitchell [Tue, 5 Feb 2019 13:48:21 +0000 (13:48 +0000)]
Avoid leak in multiconcat with overloading.
RT #133789
In the path taken through pp_multiconcat() when one or more args have
side-effects such as tying or overloading, multiconcat has to decide
whether to just return the result of all the concatting as-is, or to
first assign it to an expression or variable if the op includes an
implicit assign (such as $lex = x.y.z or $a[0] = x.y.z).
The code was getting this right for those two cases, and was also
getting it right for the append cases ($lex .= x.y.z and $a[0] .= x.y.z),
which don't need assigns. But for the bare case (x.y.z) it was assigning
to the op's targ as well as returning the value. Hence leaking a
reference until destruction of the sub and its pad.
This commit stops the assign in that last case.
David Mitchell [Mon, 4 Feb 2019 15:17:02 +0000 (15:17 +0000)]
Perl_try_amagic_un/bin re-indent
After the previous commit's simplification, eliminate a set of braces and
re-indent a block of code.
David Mitchell [Mon, 4 Feb 2019 15:07:11 +0000 (15:07 +0000)]
Eliminate AMGf_set flag
I added this flag a few years ago when I revamped the overload macros
tryAMAGICbin() etc. It allowed two different classes of macros to
share the same functions (Perl_try_amagic_un/Perl_try_amagic_bin)
by indicating what type of action is required.
However, the last few commits have made those two functions able to
robustly always determine whether it's an assign-type action
($x op= $y or $lex = $x op $x) or a plain set-result-on-stack operation
($x op $y).
So eliminate this flag.
Note that this makes the ops which have the AMGf_set flag hard-coded
infinitesimally slower, since Perl_try_amagic_bin no longer skips the
checks for assign-ness. But compared with the overhead of having
already called the overload method, this is trivial.
On the plus side, it makes the code smaller and easier to understand.
David Mitchell [Mon, 4 Feb 2019 14:52:01 +0000 (14:52 +0000)]
Perl_try_amagic_bin(): eliminate dATARGET
.. and replace with explicit tests and assigns to targ.
This macro includes an OPf_STACKED test which has already been done
above. Also, by protecting the OPf_STACKED test within a AMGf_assign
test, we can eliminate the AMGf_set flag in the next commit, and use the
same set of code for both AMGf_set and AMGf_assign variant calls to
Perl_try_amagic_bin().
David Mitchell [Mon, 4 Feb 2019 14:11:13 +0000 (14:11 +0000)]
Eliminate SvPADMY tests from overload code
A couple of places in the overload code do SvPADMY(TARG) to decide
whether this is a normal op like ($x op $y), where the targ will have
SVs_PADTMP set, or a lexical assignment like $lex = ($x op $y) where the
assign has been optimised away and the op is expected to directly assign
to the targ which it thinks is a PADTMP but is really $lex.
Since the SVs_PADMY flag was eliminated a while ago, SvPADMY() is just
defined as !(SvFLAGS(sv) & SVs_PADTMP). Thus the overload code is
relying on the absence of a PADTMP flag in the target to deduce that the
OPpTARGET_MY optimisation is in effect. This seems to work (at least for
the code in the test suite), but can't be regarded as robust. This
commit removes each SvPADMY() test and replaces it with the twin
if ( (PL_opargs[PL_op->op_type] & OA_TARGLEX)
&& (PL_op->op_private & OPpTARGET_MY))
tests.
David Mitchell [Mon, 4 Feb 2019 13:48:13 +0000 (13:48 +0000)]
Eliminate opASSIGN macro usage from core
This macro is defined as
(PL_op->op_flags & OPf_STACKED)
and indicates, for ops which support it, that the mutator-variant of the
op is present (e.g. $x += 1).
This macro was mainly used as an arg for the old-style overloading
macros (tryAMAGICbin()) which were eliminated several years ago.
This commit removes its vestigial usage, and instead tests OPf_STACKED
directly at each location, along with adding a comment about the
significance of the flag.
This removes one item of obfuscation from the overloading code.
There is one potentially functional change in this commit:
Perl_try_amagic_bin() was sometimes testing for OPf_STACKED without
first checking that it had been called with the AMGf_assign flag (which
indicates that this op supports a mutator variant). With this commit, it
now checks first, so this is theoretically a bug fix. In practice that
section of code was never reached without AMGf_assign always being set
anyway.
Tony Cook [Tue, 5 Feb 2019 08:40:53 +0000 (19:40 +1100)]
(perl #133824) fix threading builds
I failed to test the build with threads.
Karl Williamson [Fri, 1 Feb 2019 15:48:20 +0000 (08:48 -0700)]
regen/unicode_constants.pl: generate UTF-8 for U+307
This will be needed in a future commit
Karl Williamson [Fri, 1 Feb 2019 15:29:51 +0000 (08:29 -0700)]
t/loc_tools.pl: Add fcn to return all UTF-8 locales
This will be needed in future commits
Karl Williamson [Fri, 1 Feb 2019 18:45:34 +0000 (11:45 -0700)]
pp.c: White-space only
Indent block newly formed in the previous commit
Karl Williamson [Fri, 1 Feb 2019 18:43:10 +0000 (11:43 -0700)]
pp.c: Avoid use of unsafe function
The function is unsafe because it doesn't check for running off the end
of the buffer if presented with illegal UTF-8. The only remaining use
now is from mathoms.c.
Karl Williamson [Fri, 1 Feb 2019 18:41:14 +0000 (11:41 -0700)]
pp.c: Add branch prediction hint
This conditional is very rarely true
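A branch prediction hint of this kind is typically built on a compiler builtin. The sketch below defines a hypothetical MY_UNLIKELY macro using GCC and Clang's __builtin_expect, with a plain fallback elsewhere; perl's own UNLIKELY() macro is defined in its headers and may differ in detail. The function is an invented example of a loop with a rare error path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hint macro: tell the compiler the condition is rarely
 * true. On other compilers it degrades to the bare condition. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define MY_UNLIKELY(x) (x)
#endif

/* Sum an array of non-negative ints; return -1 if a negative value
 * is found. The error test is marked as the unlikely branch. */
long sum_nonnegative(const int *v, size_t n)
{
    long total = 0;
    size_t i;

    assert(v != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (MY_UNLIKELY(v[i] < 0))
            return -1;          /* rare error path */
        total += v[i];
    }
    return total;
}
```

The hint does not change the result of the function; it only lets the compiler lay out the common path as the fall-through case.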
Karl Williamson [Wed, 30 Jan 2019 18:24:12 +0000 (11:24 -0700)]
pp.c: Don't assume worst case memory needs
Since 5.28, there has been a function that will calculate the expansion
of a string when converted into UTF-8, using per-word operations. This
means it runs 8 times faster than the previous way of doing this count.
I've come to believe it is better to calculate how much memory we need
than to overallocate based on worst-case scenarios. This is because in
very large strings, overallocating can lead to unnecessarily inefficient
processing.
This commit changes several instances in pp.c where a string needs to be
converted to UTF-8 to not assume the worst case, but instead calculate
what's needed using the faster function.
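The trade-off can be illustrated with a simple sketch that computes the exact UTF-8 length of a Latin-1 (native 8-bit) string on an ASCII platform. Each byte below 0x80 stays one byte and each byte at or above 0x80 becomes two, so the worst case is twice the input length. This version counts one byte at a time for clarity; the function the commit refers to works a word at a time, and the name used here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return the number of bytes needed to hold the UTF-8 form of the
 * len-byte Latin-1 string s (ASCII platforms only). */
size_t utf8_length_of_latin1(const unsigned char *s, size_t len)
{
    size_t extra = 0;
    size_t i;

    assert(s != NULL || len == 0);
    for (i = 0; i < len; i++) {
        /* Bytes with the high bit set need a second byte in UTF-8. */
        extra += (s[i] >= 0x80);
    }
    return len + extra;
}
```

For the three-byte string "a\xE9b" this yields 4, whereas a worst-case allocation would reserve 6 bytes plus any trailing NUL.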
Karl Williamson [Wed, 30 Jan 2019 18:09:01 +0000 (11:09 -0700)]
pp.c: Don't use function call for easy copy
Like the previous commit, this code is adding the UTF-8 for a Greek
character to a string. It previously used Copy, but this character is
representable as two bytes in both ASCII and EBCDIC UTF-8, the only
character sets that Perl will ever support, so we can use the
specialized code that is used most everywhere else for two byte UTF-8
characters, avoiding the function overhead, and having to treat this
character as particularly special.
Karl Williamson [Wed, 30 Jan 2019 17:52:41 +0000 (10:52 -0700)]
pp.c: Don't use function call for easy copy
This code is adding the UTF-8 for a Greek character to a string. It
previously used Copy, but this character is representable as two bytes
in both ASCII and EBCDIC UTF-8, the only character sets that Perl will
ever support, so we can use the specialized code that is used most
everywhere else for two byte UTF-8 characters, avoiding the function
overhead, and having to treat this character as particularly special.
Karl Williamson [Wed, 30 Jan 2019 17:35:21 +0000 (10:35 -0700)]
pp.c: pp_fc(): Simplify
The function being called does everything that the code being eliminated
here did. We just pass the function the final destination instead of a
temporary.
Karl Williamson [Wed, 30 Jan 2019 17:27:17 +0000 (10:27 -0700)]
pp.c: White-space, comments only
Karl Williamson [Wed, 30 Jan 2019 17:02:35 +0000 (10:02 -0700)]
pp.c: Reorder clause order in an 'if'
This makes the test most likely to fail come first, and adds an
UNLIKELY() to it, thus saving a conditional in most instances.
Karl Williamson [Wed, 30 Jan 2019 05:02:59 +0000 (22:02 -0700)]
pp.c: Use faster method to convert to UTF-8
There is a special inline function that's used when converting a single
byte to UTF-8, that is faster than the more general one used prior to
this commit.
Karl Williamson [Wed, 30 Jan 2019 05:01:18 +0000 (22:01 -0700)]
pp.c: Add missing assert
The comments say there is an assert, but it wasn't there.
Karl Williamson [Mon, 4 Feb 2019 23:02:35 +0000 (16:02 -0700)]
t/op/lc.t: Add 'use strict'