This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
David Mitchell [Wed, 10 Feb 2016 10:53:03 +0000 (10:53 +0000)]
regen_perly.pl: improve action extracting
The regex was sometimes missing final cases from the big
action switch.
This simplifies the regex, but assumes that 'default: break;' is the last
case. This is the case in bison 2.7 and 3.0.2.
David Mitchell [Wed, 10 Feb 2016 10:21:08 +0000 (10:21 +0000)]
regen_perly.pl: print command with -v
when run verbose, print the bison command that is run
H.Merijn Brand [Thu, 11 Feb 2016 07:33:40 +0000 (08:33 +0100)]
Updated outdated link to smoke reports for HP-UX
Karl Williamson [Wed, 10 Feb 2016 22:05:45 +0000 (15:05 -0700)]
regcomp.c: Clarify error message
It is an error to specify an empty Unicode property name, like in
qr/\p{}/. It also is illegal to just say qr/\p/. Prior to this commit
the error message for that latter construct misleadingly referred to
braces. Since there are no braces in the input, they shouldn't be
mentioned.
Karl Williamson [Wed, 10 Feb 2016 21:25:31 +0000 (14:25 -0700)]
t/re/regex_sets.t: Add some tests
Karl Williamson [Wed, 10 Feb 2016 21:21:24 +0000 (14:21 -0700)]
sv.c: Handle radix being multi-byte and not UTF-8
While reviewing this code, I realized that the decimal point could
legally be a sequence of characters, not just a single one. I don't
know of any cases of that happening, but it's easy to handle that
possibility.
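As a rough illustration of treating the radix as a byte string rather than a single character, here is a minimal C sketch. The helper name `starts_with_radix` and its signature are hypothetical and are not taken from sv.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the sv.c code): returns true if the buffer
 * [s, end) begins with the radix string. The radix is compared as a
 * sequence of bytes, so it may be longer than one byte. */
static bool starts_with_radix(const char *s, const char *end,
                              const char *radix)
{
    size_t radix_len = strlen(radix);

    /* An empty radix or a too-short buffer can never match. */
    if (radix_len == 0 || (size_t)(end - s) < radix_len)
        return false;
    return memcmp(s, radix, radix_len) == 0;
}
```

A caller would advance the parse pointer by `strlen(radix)` after a match, instead of by one.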
Karl Williamson [Wed, 10 Feb 2016 21:13:10 +0000 (14:13 -0700)]
regexec.c: Skip duplicate work
By changing the fallthrough of a case of a switch to a goto we can avoid
re-executing the test that was just done.
Karl Williamson [Wed, 10 Feb 2016 18:28:58 +0000 (11:28 -0700)]
regcomp.c: Replace invalid assertion
A future commit shows that this assertion is not valid. I don't know
how it can currently be triggered, but fix the code to properly handle
the case.
Karl Williamson [Wed, 10 Feb 2016 18:22:21 +0000 (11:22 -0700)]
regcomp.c: Avoid a function call in a common case
When the regex pattern is in UTF-8, we can avoid calling the function to
convert it to a code point for the common case where it is the same in
UTF-8 as not, i.e. if the character is ASCII on ASCII platforms (and
additionally any control on EBCDIC)
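The fast path described here might look roughly like the C sketch below, assuming an ASCII platform and well-formed input. Both function names are hypothetical; `decode_multibyte` merely stands in for whatever conversion function the real code calls.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slow path: decode a well-formed 2- to 4-byte UTF-8
 * sequence. No validation is attempted in this sketch. */
static uint32_t decode_multibyte(const unsigned char *p, size_t *len)
{
    if ((p[0] & 0xE0) == 0xC0) {
        *len = 2;
        return ((uint32_t)(p[0] & 0x1F) << 6) | (p[1] & 0x3F);
    }
    if ((p[0] & 0xF0) == 0xE0) {
        *len = 3;
        return ((uint32_t)(p[0] & 0x0F) << 12)
             | ((uint32_t)(p[1] & 0x3F) << 6)
             | (p[2] & 0x3F);
    }
    *len = 4;
    return ((uint32_t)(p[0] & 0x07) << 18)
         | ((uint32_t)(p[1] & 0x3F) << 12)
         | ((uint32_t)(p[2] & 0x3F) << 6)
         | (p[3] & 0x3F);
}

/* Hypothetical wrapper: an ASCII byte is its own code point, so the
 * function call is skipped for the common case. */
static uint32_t next_code_point(const unsigned char *p, size_t *len)
{
    if (p[0] < 0x80) {
        *len = 1;
        return p[0];
    }
    return decode_multibyte(p, len);
}
```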
Karl Williamson [Wed, 10 Feb 2016 18:04:44 +0000 (11:04 -0700)]
regcomp.c: Add some grouping parens
&& has lower precedence than &, but it's better to be clear.
Karl Williamson [Wed, 10 Feb 2016 17:54:42 +0000 (10:54 -0700)]
utf8.h: Guard some macros against improper calls
The UTF8_IS_foo() macros have an inconsistent API. In some, the
parameter is a pointer, and in others it is a byte. In the former case,
a call of the wrong type will not compile, as it will try to dereference
a non-ptr. This commit makes the other ones not compile when called
wrongly, by using the technique shown by Lukas Mai (in
9c903d5937fa3682f21b2aece7f6011b6fcb2750) of ORing the argument with a
constant 0, which should get optimized out.
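The ORing technique can be sketched in C as follows. The macro name `IS_CONTINUATION_BYTE` is hypothetical and only loosely modelled on the UTF8_IS_foo() family; the actual utf8.h definitions are not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical byte-taking macro. The "| 0" is a no-op for integer
 * arguments, but bitwise OR is not defined for pointer operands, so
 * passing a pointer by mistake becomes a compile-time error. */
#define IS_CONTINUATION_BYTE(c) ((((c) | 0) & 0xC0) == 0x80)

/* Small wrapper showing a correct, byte-valued call. */
static bool is_continuation(unsigned char c)
{
    return IS_CONTINUATION_BYTE(c);
}

/* By contrast, a call such as
 *     const unsigned char *p = ...;
 *     IS_CONTINUATION_BYTE(p);
 * would be rejected by the compiler instead of silently misbehaving. */
```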
Karl Williamson [Wed, 10 Feb 2016 23:27:07 +0000 (16:27 -0700)]
regcomp.c, regexec.c: Comments, white-space only
Karl Williamson [Wed, 10 Feb 2016 23:27:13 +0000 (16:27 -0700)]
regcomp.c: Fix some parsing glitches
I undertook a code review of how regcomp.c parses things in light of the
tickets found by the fuzzer,
https://rt.perl.org/Ticket/Display.html?id=126546. This commit is the
result of my efforts so far. I was not planning to push it now, but the
work found a couple of new error messages that should be raised, and
this has to be done before the visible changes code freeze coming up all
too soon. I will add test cases after that freeze, including ones to see
that these changes fix all the observed issues.
The audit was tedious, and may have missed some things. Several issues
occurred in multiple places. One is to not advance the parse by
UTF8SKIP appropriately; another is to subtract one byte from the parse
and assume that that is pointing to the beginning of the previous
character (which under UTF-8 it may not). Another is to assume that
the pattern is a C string, i.e. that there are no interior NULs in it.
I also found unnecessary tests, given that an SV always has a
terminating NUL.
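To illustrate one of the issues listed above, backing up by one character rather than one byte, here is a minimal C sketch. The helper `prev_char` is hypothetical and assumes well-formed UTF-8; it is not the regcomp.c code.

```c
#include <stddef.h>

/* Hypothetical helper: given a position p inside a UTF-8 buffer that
 * begins at start, return the start of the character preceding p.
 * Subtracting a single byte is not enough, because that byte may be a
 * continuation byte (bit pattern 10xxxxxx) in the middle of a
 * character. */
static const unsigned char *prev_char(const unsigned char *start,
                                      const unsigned char *p)
{
    if (p <= start)
        return start;
    p--;
    while (p > start && (*p & 0xC0) == 0x80)
        p--;
    return p;
}
```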
Karl Williamson [Sun, 27 Dec 2015 17:39:02 +0000 (10:39 -0700)]
regcomp.c: Extract duped code into one fcn
This takes code that was duplicated and makes it into a single static
inline function, so that maintenance tasks don't have to be done on both
copies.
Karl Williamson [Thu, 11 Feb 2016 03:46:39 +0000 (20:46 -0700)]
porting/diag.t: Handle some E<> pod escapes
These can occur in perldiag, and so must be converted into the character
that the internal message outputs. This commit causes the major ones of
these to be converted.
Karl Williamson [Thu, 11 Feb 2016 03:31:13 +0000 (20:31 -0700)]
podcheck.t: Need to translate E<lt> and E<gt>
These can appear in links, and need to be translated into their correct
character.
Ricardo Signes [Thu, 11 Feb 2016 02:56:35 +0000 (21:56 -0500)]
release schedule: September 2016 is scheduled
Daniel Dragan [Wed, 10 Feb 2016 20:47:44 +0000 (15:47 -0500)]
add shortcut around syscalls when file not found in win32_stat
win32_stat on success makes ~7 system calls, some from perl, some from CRT,
but on failure, typically file not found, the perl syscall fails, then the
CRT stat runs, and fails too, so 5 mostly failing system calls are done
for file not found. If the perl syscall says file not found, the
file won't magically come into existence in the next 10-1000 us for the
CRT's syscalls, so skip calling the CRT and the additional syscalls
if perl didn't find the file. This patch reduces the number of syscalls
from 5 to 1 for file not found for a win32 perl stat. Benchmark and
profiling info is attached to RT ticket for this patch. Note CreateFile on
a dir fails with ERROR_ACCESS_DENIED so in some cases, a failed CreateFile
is still a successful CRT stat() which does things differently so dirs can
be opened.
Ricardo Signes [Wed, 10 Feb 2016 16:06:50 +0000 (11:06 -0500)]
update release schedule for beginnings of 5.25
Karl Williamson [Tue, 9 Feb 2016 18:50:04 +0000 (11:50 -0700)]
PATCH: [perl #8904] Revamp [:posix:] parsing
A problem with bracketed character classes, qr/[foo]/, is that there is
very little structure about them, so almost anything is legal, and so
typos just silently compile into something unintended. One of the
possible components is a posix character class. There are 14 of them,
and they have a very restricted structure, which is easy to get slightly
wrong, so that instead of the intended posix class being compiled,
something else silently is created. This commit causes the regex
compiler to look for slightly misspelled posix character classes and to
raise a warning when found. It does not change the results of the
compilation.
To do this, it introduces fuzzy parsing into the regex compiler, using
the Damerau-Levenshtein algorithm to find out how many single character
edits it would take to transform the input into one of the 14 classes.
If it is 1 or 2 off, it considers the input to have been intended to be
that class and raises the warning. If more edits would be needed, it
remains silent.
This is a heuristic, and someone could have made enough typos that this
thinks a class wasn't intended that was. Conversely it could raise a
warning when no class was intended, though warnings only happen when the
input very closely resembles one of the 14 legal posix classes.
The algorithm can be tweaked if experience indicates it should. But the
bottom line is that many more cases of unintended results will now be
warned about.
Things like having blanks in the construct and having the '^' before the
colon are recognized as being intended posix classes (given that the
actual names are close to one of the 14), and raise warnings. Again
this commit does not change what gets compiled. This found a bug in
autodoc.pl which was fixed a few commits ago.
The [. .] and [= =] POSIX constructs cause perl to croak that they are
unimplemented. This commit improves the parsing of these two, and fixes
some false positives. See
http://nntp.perl.org/group/perl.perl5.porters/230975
The new code combines two functions in regcomp.c into one new one.
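The fuzzy matching described above can be sketched in C as follows. This is only an illustration: it uses the optimal-string-alignment variant of Damerau-Levenshtein (simpler than the unrestricted algorithm), compares bare class names only, and every identifier in it (`osa_distance`, `probable_posix_class`, `MAX_NAME`) is hypothetical rather than taken from regcomp.c.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAME 16  /* assumption: longer inputs are treated as "too far" */

static size_t min3(size_t a, size_t b, size_t c)
{
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Optimal string alignment distance: counts insertions, deletions,
 * substitutions and transpositions of adjacent characters. */
static size_t osa_distance(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t d[MAX_NAME + 1][MAX_NAME + 1];
    size_t i, j;

    if (la > MAX_NAME || lb > MAX_NAME)
        return MAX_NAME + 1;

    for (i = 0; i <= la; i++) d[i][0] = i;
    for (j = 0; j <= lb; j++) d[0][j] = j;

    for (i = 1; i <= la; i++) {
        for (j = 1; j <= lb; j++) {
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;

            d[i][j] = min3(d[i - 1][j] + 1,          /* deletion */
                           d[i][j - 1] + 1,          /* insertion */
                           d[i - 1][j - 1] + cost);  /* substitution */
            if (i > 1 && j > 1
                && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]
                && d[i - 2][j - 2] + 1 < d[i][j])
                d[i][j] = d[i - 2][j - 2] + 1;       /* transposition */
        }
    }
    return d[la][lb];
}

static const char *const posix_names[] = {
    "alpha", "alnum", "ascii", "blank", "cntrl", "digit", "graph",
    "lower", "print", "punct", "space", "upper", "word", "xdigit"
};

/* Returns the closest class name when the input is 1 or 2 edits away
 * from it (the case that would warrant a warning), or NULL when the
 * input is an exact match or is too different from every class. */
static const char *probable_posix_class(const char *input)
{
    const char *best = NULL;
    size_t best_dist = 3;
    size_t k;

    for (k = 0; k < sizeof posix_names / sizeof posix_names[0]; k++) {
        size_t dist = osa_distance(input, posix_names[k]);

        if (dist == 0)
            return NULL;
        if (dist < best_dist) {
            best_dist = dist;
            best = posix_names[k];
        }
    }
    return best;
}
```

With this sketch, a transposed `aplha` is one edit from `alpha` and would be reported, while an unrelated word yields NULL and stays silent.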
Karl Williamson [Tue, 9 Feb 2016 21:00:23 +0000 (14:00 -0700)]
regcomp.c: Fix recursive parsing bug
In certain cases, regex compilation will use a substitute input string
when parsing what it thinks is a bracketed character class /[ ... ] /.
The substitute automatically had a ']' appended to it, even if the
original didn't have one, leading to wrong results.
I did not add a test for this, as the next commit causes current tests
to fail if this one isn't done.
Karl Williamson [Tue, 9 Feb 2016 18:28:32 +0000 (11:28 -0700)]
regcomp.c: White-space, variable name-change only
This changes indents, and the names of two variables in preparation for
the next commit, so the difference listing won't be so large for that.
Karl Williamson [Tue, 9 Feb 2016 18:00:58 +0000 (11:00 -0700)]
autodoc.pl: Fix misspelled /[[:alpha:]]/
This typo was caught by the work for a couple of commits down the road.
Karl Williamson [Tue, 9 Feb 2016 17:49:29 +0000 (10:49 -0700)]
Add Nick Logan to AUTHORS
The previous commit used code written by him.
Karl Williamson [Tue, 9 Feb 2016 17:40:38 +0000 (10:40 -0700)]
regcomp.c: Add code to compute edit distance (Damerau–Levenshtein)
This will be used in a future commit.
This code is taken from CPAN Text::Levenshtein::Damerau::XS with the
author's knowledge. There have been white-space changes to make it
conform better to perl's core coding standards, and declaration changes
to make it more portable, such as using UV instead of 'unsigned int',
and PERL_STATIC_INLINE instead of a less portable form, but the logic is
unchanged. One variable was changed to signed from unsigned to avoid a
warning message from some compilers.
The author and I will decide later about keeping the cpan module and
this code in sync. It changes very rarely.
Tony Cook [Wed, 10 Feb 2016 05:07:42 +0000 (16:07 +1100)]
perldelta for 1bb1a3d6d35
Tony Cook [Wed, 10 Feb 2016 05:03:22 +0000 (16:03 +1100)]
[perl #127334] S_incline: avoid overrunning the end of the parse buffer
If the rest of the allocation up to the end of addressable memory was
non-spaces, this loop could cause a segmentation fault.
Avoid that by ensuring we stop when we see a NUL.
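The shape of such a fix can be sketched in C as below. The function `skip_non_space` is a hypothetical stand-in and is not the S_incline code; it only shows a scan over non-space characters that also stops at the terminating NUL.

```c
#include <ctype.h>

/* Hypothetical scanner: advance over non-space characters, but never
 * past the terminating NUL. Without the NUL test, a buffer containing
 * no further whitespace would be read beyond its end. */
static const char *skip_non_space(const char *s)
{
    while (*s != '\0' && !isspace((unsigned char)*s))
        s++;
    return s;
}
```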
Tony Cook [Wed, 10 Feb 2016 03:35:53 +0000 (14:35 +1100)]
Tony Cook [Wed, 10 Feb 2016 03:30:08 +0000 (14:30 +1100)]
[perl #127494] don't cache AUTOLOAD as DESTROY
Otherwise S_curse() would need to do all the work gv_autoload_pvn()
already does to set up to call AUTOLOAD() (setting $AUTOLOAD etc.)
Instead, by not caching it, we ensure gv_autoload_pvn() is called
each time to perform the required setup.
This has a performance cost over adding that code to S_curse(), but the
cost of actually running the AUTOLOAD sub is likely to drown that out,
and is easily avoided by adding "sub DESTROY {}" to the module.
Tony Cook [Wed, 10 Feb 2016 00:46:48 +0000 (11:46 +1100)]
[perl #127494] TODO test for $AUTOLOAD being set for DESTROY
000814da allowed the cached DESTROY method to be an AUTOLOAD method,
but didn't ensure that $AUTOLOAD (or the equivalent for XS AUTOLOADS)
was set when AUTOLOAD was called.
Add a TODO test for this behaviour.
James E Keenan [Sun, 7 Feb 2016 12:58:29 +0000 (07:58 -0500)]
Update guidance on naming of modules.
Delete reference to comp.lang.perl.misc. Add references to module-authors
list/newsgroup and to PAUSE.
For: RT # 127435
Sawyer X [Tue, 9 Feb 2016 18:57:58 +0000 (19:57 +0100)]
Remove outdated task in release:
I checked with Graham Barr, who said the list of PAUSE accounts
that can upload perl distributions is automated and taken from:
http://pause.perl.org/pause/query?ACTION=who_pumpkin;OF=YAML
This means that if you're already on the list, you do not need
to check again on search.cpan.org or to bug Graham. :)
Tom Hukins [Tue, 9 Feb 2016 11:15:53 +0000 (11:15 +0000)]
Time::HiRes moved from "cpan" to "dist" in 91ba54
Karl Williamson [Mon, 1 Feb 2016 18:08:29 +0000 (11:08 -0700)]
locale.c: Improve -DL debug info
This changes the debug info to include if the LC_NUMERIC decimal point
(radix) character string is UTF-8 encoded or not, and it uses the actual
value stored in Perl for that string instead of the POSIX origin of that
data, thus being more accurate should they ever get out-of-sync.
Ricardo Signes [Mon, 8 Feb 2016 21:20:58 +0000 (16:20 -0500)]
move Time-HiRes from cpan to dist
Tony Cook [Mon, 8 Feb 2016 04:20:40 +0000 (15:20 +1100)]
Tony Cook [Tue, 19 Jan 2016 00:42:21 +0000 (11:42 +1100)]
[perl #124387] call AUTOLOAD when DESTROY isn't defined
Tony Cook [Tue, 19 Jan 2016 00:39:48 +0000 (11:39 +1100)]
[perl #124387] TODO test for AUTOLOAD on DESTROY
Tony Cook [Mon, 18 Jan 2016 06:42:32 +0000 (17:42 +1100)]
[perl #126410] keep the DESTROY cache in mro_meta
We're already keeping destroy_gen there, so keep the CV there too.
The previous implementation, introduced in 8c34e50d, kept the
destroy method cache in the stash's stash, which broke B's SvSTASH
method.
Before that, the DESTROY method was cached in overload magic.
A previous version of this patch didn't clear the destructor cache on
a clone, which caused ext/XS-APItest/t/clone_with_stack.t to fail.
Todd Rinaldo [Mon, 18 Jan 2016 05:30:37 +0000 (16:30 +1100)]
Document broken SvSTASH for %version:: in B's test suite
RT 126410: This may not be a B bug but we have no test coverage for SvSTASH at
the moment. TODO the test until it is working correctly.
TonyC: fix syntax error and update MANIFEST
Tony Cook [Sun, 7 Feb 2016 23:16:50 +0000 (10:16 +1100)]
perldelta for 071db91b12fc
Tony Cook [Sun, 7 Feb 2016 23:02:48 +0000 (10:02 +1100)]
add Pip Cet to AUTHORS
Pip Cet [Sun, 7 Feb 2016 23:01:06 +0000 (10:01 +1100)]
[perl #127474] fix operator precedence when (castflags & 2)
Jarkko Hietaniemi [Fri, 5 Feb 2016 17:00:06 +0000 (12:00 -0500)]
Storable version bump.
Jarkko Hietaniemi [Thu, 4 Feb 2016 01:35:49 +0000 (20:35 -0500)]
POSIX version bump.
Jarkko Hietaniemi [Fri, 5 Feb 2016 21:37:58 +0000 (16:37 -0500)]
POSIX: strcmp NE strEQ().
CID 135006: Constant expression result (CONSTANT_EXPRESSION_RESULT)
The Coverity 'detail' is priceless:
always_true_or: The "or" condition strcmp(strings->name, "decimal_point") || strcmp(strings->name, "thousands_sep") || strcmp(strings->name, "grouping") will always be true because strings->name cannot be equal to two different values at the same time, so it must be not equal to at least one of them.
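The defect Coverity describes is easy to reproduce in isolation. In the C sketch below, the function names are hypothetical and `STR_EQ` is a local stand-in for perl's strEQ(); the field names come from the Coverity message above.

```c
#include <stdbool.h>
#include <string.h>

/* Local stand-in for perl's strEQ(): true when the strings are equal. */
#define STR_EQ(a, b) (strcmp((a), (b)) == 0)

/* Buggy form: strcmp() returns non-zero for *unequal* strings, and a
 * name cannot equal all three literals at once, so at least one
 * operand is non-zero and the result is always true. */
static bool is_string_field_buggy(const char *name)
{
    return strcmp(name, "decimal_point")
        || strcmp(name, "thousands_sep")
        || strcmp(name, "grouping");
}

/* Intended form: true only when the name is one of the three. */
static bool is_string_field(const char *name)
{
    return STR_EQ(name, "decimal_point")
        || STR_EQ(name, "thousands_sep")
        || STR_EQ(name, "grouping");
}
```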
Jarkko Hietaniemi [Thu, 4 Feb 2016 02:40:34 +0000 (21:40 -0500)]
POSIX: Check fds against negatives.
Not directly an open Coverity issue (though previously we have had
similar ones) but inspired by the similar change for signal numbers.
Jarkko Hietaniemi [Thu, 4 Feb 2016 01:06:38 +0000 (20:06 -0500)]
POSIX: Check signal numbers against negatives.
CID 135020: Argument cannot be negative (NEGATIVE_RETURNS)
CID 135021: Argument cannot be negative (NEGATIVE_RETURNS)
sigismember()
sigaddset()
sigdelset()
Note that sigaction() already has its own handling for the signal number.
Jarkko Hietaniemi [Thu, 4 Feb 2016 00:40:21 +0000 (19:40 -0500)]
Storable: Own ASSERT or no, we want to assert(prev).
Coverity CID 135012: Explicit null dereferenced (FORWARD_NULL)
The prev can be set to NULL (zero) just a few lines earlier.
Jarkko Hietaniemi [Thu, 4 Feb 2016 02:42:36 +0000 (21:42 -0500)]
XS-APItest: Length cannot be negative.
Coverity CID 135019: Argument cannot be negative (NEGATIVE_RETURNS)
Jarkko Hietaniemi [Wed, 3 Feb 2016 15:53:23 +0000 (10:53 -0500)]
ODBM_File: Avoid TOCTOU and using negative returns.
Coverity CID 135022: Argument cannot be negative (NEGATIVE_RETURNS)
Coverity CID 135027: Time of check time of use (TOCTOU)
Replace use of stat()-guarded use of creat() (wow) with open(...O_EXCL...)
(when O_CREAT) so that there is no race condition (TOCTOU) window
between the stat() check for non-existence (which can fail also for
other reasons) and the two (sic) creat() calls.
Similarly, without O_CREAT, use open(...O_RDONLY...) instead of the stat().
Possible problem: arguably, systems old enough to be still using
ODBM_File (or requiring creat()) might not have the O_EXCL.
Jarkko Hietaniemi [Sun, 7 Feb 2016 00:59:09 +0000 (19:59 -0500)]
Add missing break in switch.
Coverity CID 135145: Missing break in switch (MISSING_BREAK)
Jarkko Hietaniemi [Sun, 7 Feb 2016 00:55:53 +0000 (19:55 -0500)]
Add missing break in switch.
Coverity CID 28986: Missing break in switch (MISSING_BREAK)
Jarkko Hietaniemi [Sun, 7 Feb 2016 00:50:33 +0000 (19:50 -0500)]
assert(PL_parser)
Coverity CID 135144: Dereference after null check (FORWARD_NULL)
Earlier, pp.c:8254-ish, PL_parser is tested against NULL, so it
presumably can be NULL.
Jarkko Hietaniemi [Sun, 7 Feb 2016 00:28:00 +0000 (19:28 -0500)]
Assert no bad array access.
Coverity CID 135147: Out-of-bounds access (OVERRUN)
Long-distance trouble: regexec.c:8922-ish calls (if DEBUGGING) the
regprop() in regcomp.c, which can access the five-element bounds[]
array with the flags value as the offset. However, Coverity thinks
it sees that in regexec.c the flags value may be up to nine.
Jarkko Hietaniemi [Sun, 7 Feb 2016 01:21:03 +0000 (20:21 -0500)]
Do not try to fchown() to uid -1 and gid -1.
Jarkko Hietaniemi [Sun, 7 Feb 2016 00:05:10 +0000 (19:05 -0500)]
Check against negative uid/gid for fchown().
Coverity CID 135145: Argument cannot be negative (NEGATIVE_RETURNS)
Jarkko Hietaniemi [Sat, 6 Feb 2016 23:54:29 +0000 (18:54 -0500)]
assert(cv) before doing CvROOT(cv)
Coverity CID 29020 (an old one from 2014, wondering why it now resurfaced)
Jarkko Hietaniemi [Wed, 3 Feb 2016 21:33:20 +0000 (16:33 -0500)]
Check for invlist_search() returning negative array indices.
Coverity CID 135014: Negative array index read (NEGATIVE_RETURNS)
Coverity CID 135015: Negative array index read (NEGATIVE_RETURNS)
Multiple cases, all the getLB_...() uses in regexec.c.
Address this by under DEBUGGING rerouting the invlist_search()
result through a static helper function which does the sanity
checking against negatives and then returns the result.
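A checking helper of this kind might look like the C sketch below. All names here are hypothetical: `range_search` only imitates a search that can return -1, and `checked_index` plays the role of the static sanity-checking function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a search over sorted range starts: returns
 * the index of the last element <= cp, or -1 if cp precedes them all. */
static ptrdiff_t range_search(const unsigned long *starts, size_t n,
                              unsigned long cp)
{
    size_t lo = 0, hi = n;              /* search window is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (starts[mid] <= cp)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (ptrdiff_t)lo - 1;
}

/* Sanity check: a negative result must never be used as an index. */
static ptrdiff_t checked_index(ptrdiff_t idx)
{
    assert(idx >= 0);
    return idx;
}

/* Lookup that routes the search result through the check first. */
static int value_for(const unsigned long *starts, const int *values,
                     size_t n, unsigned long cp)
{
    return values[checked_index(range_search(starts, n, cp))];
}
```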
Jarkko Hietaniemi [Wed, 3 Feb 2016 14:53:54 +0000 (09:53 -0500)]
If not using smallbuf and len > sizeof(d_name), Move() is illegal.
Coverity CID 135024: Out-of-bounds access (OVERRUN)
If the len is not <= sizeof(smallbuf), the len is at least
sizeof(smallbuf) + 1, which means at least 257. Now, if the
sizeof(dirent->d_name) is < 257, the Move() would access bytes
beyond the end of d_name[]. Yes, this would require the d_namlen
(for example) to be out of sync with d_name[]. But paranoia is good.
Because of the severity of the problem (indicating serious mess),
returning NULL instead of a partial result is probably better.
Possible portability problem: can d_name ever be not an array,
but instead a bare char pointer? If that can happen, the sizeof(d_name)
is wrong, and in that case we have to have some other way of figuring
out the maximum size for a directory entry name.
The smallbuf probably could/should be of size MAXPATHLEN.
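The guard can be illustrated with a minimal C sketch. The function `copy_name` and its parameters are hypothetical, and memcpy() stands in for perl's Move(); the point is only that the length is checked against the source array size before any bytes are copied.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy len bytes out of a fixed-size name field.
 * If len claims more bytes than the source array holds, or would not
 * fit in the destination with its terminating NUL, refuse and return
 * NULL rather than produce a partial result. */
static char *copy_name(char *dst, size_t dst_size,
                       const char *name, size_t name_size, size_t len)
{
    if (len > name_size || len >= dst_size)
        return NULL;
    memcpy(dst, name, len);
    dst[len] = '\0';
    return dst;
}
```

As the message notes, passing `sizeof` of the name field as `name_size` is only meaningful when that field is a true array rather than a bare pointer.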
Jarkko Hietaniemi [Fri, 5 Feb 2016 23:48:02 +0000 (18:48 -0500)]
Whitespace only: zap empty lines.
Jarkko Hietaniemi [Fri, 5 Feb 2016 21:06:53 +0000 (16:06 -0500)]
Lexical scoping in case statement is tricky.
Coverity CID 135142: Structurally dead code (UNREACHABLE)
The case labels are just effectively goto labels, and therefore
any variable initialization will not happen. Luckily that is not a problem
here: the variables will always be overwritten as needed.
But better not to introduce false lexical scopes to avoid future
misery.
In the general case, the only way to have lexically tighter scopes is
to have dedicated blocks for each case, but that doesn't easily work here,
with all the tricky jumping.
We could switch() the second time on CxTYPE(), and have these variables
scoped on an inner block, but since this is hot hot hot code, better
not to mess with that, and just hoist the variables to an outer scope.
Any deeper refactoring should be done with profilers at hand.
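The hoisting approach can be sketched in C as below. The function `classify` and its values are hypothetical; the sketch only shows variables declared in the scope enclosing the switch and assigned explicitly in each case that uses them.

```c
/* Hypothetical example: "weight" is declared outside the switch.
 * Declaring it with an initializer inside the switch body would be
 * misleading, because jumping to a case label skips the initializer. */
static int classify(int type)
{
    int weight;        /* hoisted; assigned in every case that reads it */
    int result = 0;

    switch (type) {
    case 1:
        weight = 10;
        result = weight * 2;
        break;
    case 2:
        weight = 20;
        result = weight + 1;
        break;
    default:
        break;
    }
    return result;
}
```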
Jarkko Hietaniemi [Wed, 3 Feb 2016 13:11:14 +0000 (08:11 -0500)]
assert() that itersvp is non-NULL.
Coverity CID 135011 Explicit null dereferenced
In pp_iter() there are multiple derefers of *itersvp, but at the
setting of itersvp the CxITERVAR() can return NULL, add an assert()
to catch the badness in debug builds (as the Coverity builds are).
David Mitchell [Sat, 6 Feb 2016 11:04:45 +0000 (11:04 +0000)]
pp_enter: calculate gimme earlier in XS branch
My commit 801bbf618dc made it so that pp_entersub only calculates
gimme at the point it's needed, to avoid wasting register resources.
However, in the XS branch it was a bit over-enthusiastic: it's possible
for an XS sub to save PL_op and change its value. The old value will
only get restored when pp_entersub does LEAVE, which is *after* we
calculate gimme. So grab the value before the XS sub is called.
Craig A. Berry [Sat, 6 Feb 2016 02:29:20 +0000 (20:29 -0600)]
perldelta: recent %ENV changes on VMS.
Craig A. Berry [Sat, 6 Feb 2016 00:26:06 +0000 (18:26 -0600)]
perldelta for podlators update.
Craig A. Berry [Fri, 5 Feb 2016 19:24:20 +0000 (13:24 -0600)]
Integrate podlators 4.06.
Jarkko Hietaniemi [Fri, 5 Feb 2016 16:35:20 +0000 (11:35 -0500)]
cmpVERSION STDERR messages for test failures.
Better known as t/porting/cmp_version
Craig A. Berry [Thu, 4 Feb 2016 22:26:05 +0000 (16:26 -0600)]
Do environ key case consistently on VMS.
For those %ENV elements based on the CRTL environ array, we've
always preserved case when setting them but done look-ups only
after upcasing the key first, which makes lower- or mixed-case
entries go missing.
So make them consistently case-preserved and in the docs
distinguish this behavior from the case-blind behavior of keys
for %ENV entries based on logical names and DCL symbols, which
remains unchanged.
Jarkko Hietaniemi [Thu, 4 Feb 2016 14:54:53 +0000 (09:54 -0500)]
Cast away Solaris Studio 12.3 warning.
"pp.c", line 3220: warning: initializer will be sign-extended: -2147483648
Jarkko Hietaniemi [Thu, 4 Feb 2016 12:38:56 +0000 (07:38 -0500)]
OpenBSD does not do si_uid with sigaction().
Seen in OpenBSD 4.8, but found no mention of this working in 5.x.
Tony Cook [Thu, 4 Feb 2016 06:12:23 +0000 (17:12 +1100)]
perldelta the fix for [perl #126621]
which was in 2e2d7405a2b751f778ee3118a87a5f31233efc77
Jarkko Hietaniemi [Thu, 4 Feb 2016 01:18:23 +0000 (20:18 -0500)]
POSIX version bump.
Jarkko Hietaniemi [Thu, 4 Feb 2016 01:16:39 +0000 (20:16 -0500)]
We're against contractions.
More importantly: that apostrophe messed up my Emacs' syntax highlighting.
Jarkko Hietaniemi [Thu, 4 Feb 2016 01:15:21 +0000 (20:15 -0500)]
Oddly placed unused decls for fma() and the gamma funcs.
The fma() was also missing not_here().
Karl Williamson [Wed, 3 Feb 2016 23:06:13 +0000 (16:06 -0700)]
perldelta, perlguts: Fix typos
Karl Williamson [Wed, 3 Feb 2016 19:03:09 +0000 (12:03 -0700)]
perlapi: Clarify that a literal string must end in a NUL
Some entries already had this. For those, it standardizes the text.
Karl Williamson [Wed, 3 Feb 2016 17:29:17 +0000 (10:29 -0700)]
podcheck.t: regen db for overlong verbatim lines
It seems to me best to keep these new overlong verbatim lines in
perlandroid.pod (introduced in 2000dc211bab554f6ffba69e510ae90f85f8c931)
Karl Williamson [Wed, 3 Feb 2016 17:27:53 +0000 (10:27 -0700)]
perlguts: Make verbatim lines fit in 79 cols
Karl Williamson [Wed, 3 Feb 2016 17:27:14 +0000 (10:27 -0700)]
perldelta: Make verbatim line fit in 79 columns
Karl Williamson [Fri, 29 Jan 2016 04:31:36 +0000 (21:31 -0700)]
re/uniprops: Fix EBCDIC issue
Things like qr/\s/ are expecting native code points, not EBCDIC.
Karl Williamson [Fri, 22 Jan 2016 18:04:51 +0000 (11:04 -0700)]
regexec.c: Refactor \b{sb} handling
This rule, unlike the other Unicode boundary ones, does not really lend
itself to a decision pair table, because most of the decisions require
more context than just the current pair. However, when I wrote the code
this replaces, I couldn't see the forest for the trees. It turns out
that the needed context is the same or almost the same for many of the
rules, so only has to be found once, as opposed to the old rules, where
it could need to be re-found several times.
Karl Williamson [Fri, 22 Jan 2016 18:01:16 +0000 (11:01 -0700)]
regexec.c: Fix comment, white-space
Karl Williamson [Wed, 20 Jan 2016 23:15:38 +0000 (16:15 -0700)]
Use table lookup for qr/\b{wb}/
This follows the recent commits for lb and gcb, and generates a table at
regen time for Word Breaking. The result may run faster, depending on
the compiler optimization capabilities, than before, and is easier to
maintain, as it's easier to smack a new rule into the regen perl script
than it is to change the C code.
David Mitchell [Tue, 26 Jan 2016 15:14:50 +0000 (15:14 +0000)]
Add support for bison 3.0
Mainly it no longer generates some tables used for debugging.
This commit also adds a new define showing what bison version was used.
David Mitchell [Wed, 3 Feb 2016 13:01:27 +0000 (13:01 +0000)]
add perldelta entry for context stack work
Jarkko Hietaniemi [Wed, 3 Feb 2016 02:28:32 +0000 (21:28 -0500)]
Additional hexfp %a tests inspired by c95ea682.
If both plus and space are specified, the space is ignored.
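C's own printf family follows the same flag rule, so it can be shown with a small C sketch. This demonstrates the C library behaviour only, not Perl's sprintf implementation, and the helper `format_double` is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a double with a caller-supplied
 * conversion such as "%+ a", "%+a" or "% a". */
static int format_double(char *buf, size_t size, const char *fmt,
                         double value)
{
    return snprintf(buf, size, fmt, value);
}
```

With this helper, "%+ a" and "%+a" should produce identical output starting with '+', while "% a" alone produces a leading space for non-negative values.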
David Mitchell [Wed, 3 Feb 2016 09:35:46 +0000 (09:35 +0000)]
[MERGE] revamp context system
David Mitchell [Mon, 18 Jan 2016 13:23:14 +0000 (13:23 +0000)]
remove dSP from a couple of pp_enter* fns
These functions don't modify the args stack, so there's no need
to dSP; ...; PUTBACK.
Also write a negated bit test condition in pp_enterwhen() a bit less
clumsily.
David Mitchell [Mon, 18 Jan 2016 13:09:35 +0000 (13:09 +0000)]
leave_adjust_stacks() fix some code comments
One comment was obsolete; the other referred to the wrong pp function
David Mitchell [Mon, 18 Jan 2016 12:31:24 +0000 (12:31 +0000)]
leave_adjust_stacks(): avoid accessing random tmps
There was some code in leave_adjust_stacks() that checked whether the
current arg sv being processed was the same SV as the first SvTEMP
above the 'cut' on the tmps stack. If there was nothing above the cut,
it was actually comparing against whatever garbage was 1 slot above the
current PL_tmps_ix. This was almost always harmless (but of course wrong);
the only symptom was an occasional smoke failure in t/re/pat_re_eval_thr.t,
due to this:
local our $s = "abc";
my $qr = qr/^(?{1})$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s$s/;
where a qr// with a code block acts like
my $qr = sub : lvalue { .....; }->()
to make closures happen correctly. The lvalue return from the anon sub was
triggering this because the address of $s was in one of the unused slots
above PL_tmps_ix.
I couldn't get it to fail in a simple test case.
At the same time, I moved a SvREFCNT_inc() inside a check for
!SvIMMORTAL(sv) since there's no need to do it for PL_sv_undef etc.
David Mitchell [Mon, 4 Jan 2016 09:16:52 +0000 (09:16 +0000)]
make gimme consistently U8
The value of gimme stored in the context stack is U8.
Make all other uses in the main core consistent with this.
My primary motivation on this was that the new function cx_pushblock(),
which I gave a 'U8 gimme' parameter, was generating warnings where callers
were passing I32 gimme vars to it. Rather than play whack-a-mole, it
seemed simpler to just uniformly use U8 everywhere.
Porting/bench.pl shows a consistent reduction of about 2 instructions on
the loop and sub benchmarks, so this change isn't harming performance.
David Mitchell [Sun, 3 Jan 2016 19:38:13 +0000 (19:38 +0000)]
fix -DPERL_GLOBAL_STRUCT_PRIVATE
Perl_leave_adjust_stacks() needed a dVAR
David Mitchell [Sun, 3 Jan 2016 17:28:06 +0000 (17:28 +0000)]
perlfunc: say what block types 'return' recognises
'return' recognises these blocks
sort { ...; return } ...
/(?{ ...; return })/;
but ignores these:
grep { ...; return } ...;
map { ...; return } ...;
David Mitchell [Sun, 3 Jan 2016 15:23:56 +0000 (15:23 +0000)]
perlguts: add section on context stack
David Mitchell [Sun, 3 Jan 2016 15:17:40 +0000 (15:17 +0000)]
fix cx_dup for CXt_LOOP_PLAIN
The context stack duplication code tries to duplicate the loop var
even for CXt_LOOP_PLAIN, which doesn't have a loop var. This didn't
use to matter, since PUSHLOOP_PLAIN() used to set the field to NULL;
for efficiency it's now left untouched. So don't try to use it.
Also update the debugging context names since the ordering of the
CXt_LOOP_* has changed recently.
David Mitchell [Thu, 31 Dec 2015 10:39:17 +0000 (10:39 +0000)]
MULTICALL *shouldn't* clear savestack
About 25 commits ago in this branch I added a commit:
MULTICALL should clear scope after each call
To fix RT #116577, which reported that lexicals were only being freed
at the end of the MULTICALL, not after each individual call to the sub.
In that commit, I added a LEAVE_SCOPE() to the end of the MULTICALL()
definition. However, after further thought I realise that's wrong. If a
multicall sub does something like { my $x = $_*2; $x }, then the returned
value would be freed before the XS code which calls MULTICALL() has a
chance to do anything with it (e.g. test for truth, or add it to the return
args or whatever).
So I think popping the save stack should be the responsibility of the
caller of MULTICALL(), rather than of MULTICALL() itself.
David Mitchell [Wed, 30 Dec 2015 15:48:52 +0000 (15:48 +0000)]
add blk_old_tmpsfloor shortcut
Add
#define blk_old_tmpsfloor cx_u.cx_blk.blku_old_tmpsfloor
to match all the other 'struct block' fields which have similar short cuts
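How such a shortcut macro reads in use can be sketched in C as below. The struct layouts are heavily simplified and hypothetical (the real context structures have many more fields); only the #define line itself comes from the commit.

```c
/* Simplified, hypothetical layouts for illustration only. */
struct sketch_block {
    int blku_oldsp;
    int blku_old_tmpsfloor;
};

struct sketch_context {
    int cx_type;
    union {
        struct sketch_block cx_blk;
    } cx_u;
};

/* The shortcut: lets code write cx->blk_old_tmpsfloor instead of
 * spelling out the full path through the union. */
#define blk_old_tmpsfloor cx_u.cx_blk.blku_old_tmpsfloor

static int get_tmpsfloor(const struct sketch_context *cx)
{
    return cx->blk_old_tmpsfloor;
}
```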
David Mitchell [Wed, 30 Dec 2015 15:20:41 +0000 (15:20 +0000)]
dMULTICALL: remove unused vars
dMULTICALL declares several vars that are used either to maintain
state across multiple calls, or to pass values to PUSHSUB etc, where
those macros expected to obtain some of their args by values being
implicitly passed via local vars. Since PUSHSUB has been replaced by
cx_pushsub() which now has all parameters explicitly passed, there is
no longer any need for those vars. So this commit eliminates them:
newsp
hasargs
There are also a couple vars which are no longer used due to changes to
the implementation over time; these can also be eliminated:
cx
multicall_cv
Finally, this branch introduced a new var, saveix_floor; rename it to
multicall_saveix_floor for consistency with other dMULTICALL vars.
Although none of these vars are listed in the documentation, it's possible
that some code out there may rely on them in some way, and will need to be
fixed up.
David Mitchell [Wed, 30 Dec 2015 14:33:51 +0000 (14:33 +0000)]
convert CX_{PUSH|POP}{WHEN|GIVEN} to inline fns
Replace CX_PUSHGIVEN() with cx_pushgiven() etc.