This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
Karl Williamson [Tue, 6 Jan 2015 20:07:51 +0000 (13:07 -0700)]
PATCH: [perl #123539] regcomp.c node overrun/segfault
This is a minimal patch suitable for a maintenance release. It extracts
the guts of reguni and REGC without the conditional they have. The next
commit will do some refactoring to avoid branching and to make things
clearer.
This bug is due to the current two pass structure of the Perl regular
expression compiler. The first pass attempts to do just enough work to
figure out how much space to malloc for the compiled pattern; and the
2nd pass actually fills in the details. One problem with this design is
that in many cases quite a bit of work is required to figure out the
size, and this work is thrown away and redone in the second pass.
Another problem is that it is easy to forget to do enough work in the
sizing pass, and that is what happened with the blamed commit. I
understand that there are plans (God speed) to change the compiler
design.
When not under /i matching, the size of a node that will match a
sequence of characters is just the number of bytes those characters take
up. We have an easy way to calculate the number of bytes any code point
will occupy in UTF-8, and it's just 1 byte per code point for non-UTF-8.
So in the sizing pass, we don't actually have to figure out the
representation of the characters. However under /i matching, we do.
First of all, matching of UTF-8 strings is done by replacing each
character of each string by its fold-case (function fc()) and then
comparing. This is required by the nature of full Unicode matching
which is not 1-1. If we do that replacement for the pattern at compile
time, we avoid having to do it over-and-over as pattern matching
backtracks at execution. And because fc(x) may not occupy the same
number of bytes as x, and there is no easy way to know that size without
actually doing the fc(), we have to do the fold in the sizing pass.
Now, there are relatively few folds where sizeof(fc(x)) != sizeof(x), so
we could construct an exception table for those few cases where the sizes
differ, and look them up in that.
But there is another reason that we have to fold in the sizing pass.
And that is because of the potential for multi-character folds being
split across regnodes. The regular expression compiler generates
EXACTish regnodes for matching sequences of characters exactly or via
/i. The limit for how many bytes in a sequence such a node can match is
255 because the length is stored in a U8. If the pattern has a sequence
longer than that, it is split into two or more EXACTish nodes in a row.
(Actually, the compiler splits at a size much lower than that; I'm not
sure why, but then two adjoining nodes whose total sum length is at most
255 get joined later in the third, optimizing pass.) Now consider
matching the character U+FB03 LATIN SMALL LIGATURE FFI. It matches the
sequence of the three characters "f f i". Because of the design of the
regex pattern matching code, if these characters are such that the first
one or two are at the end of one EXACTish node, and the final two or one
are in another EXACTish node, then U+FB03 wrongly would not match them.
Matches can't cross node boundaries. If the pattern were tweaked so all
three characters were in either the first or second node, then the match
would succeed. And that is what the compiler does. When it reaches the
node's size limit, and the final character is one that is a non-terminal
character in a multi-char fold, what's in the node is backed-off until
it ends with a character without this characteristic. This has to be
done in the sizing pass, as we are repacking the nodes, which can affect
the size of the pattern, and we have to know what the folds are in order
to determine all this.
(We don't fold non-UTF-8 patterns. This is for two reasons. One is
that one character, the U+00B5 MICRO SIGN, folds to above-Latin1, and if
we folded it, we would have to change the pattern into UTF-8, and that
would slow everything down. I've thought about adding a regnode type
for the much more common case of a sequence that doesn't have this
character in it, and which could hence be folded at compile time. But
I've not been able to justify this because of the 2nd reason, which is
folds in this range are simple enough to be handled by an array lookup,
so folding is fast at runtime.)
Then there is the complication of matching under locale rules. This bug
manifested itself only under /l matching. We can't fold at pattern
compile time, because the folding rules won't be known until runtime.
This isn't a problem for non-UTF-8 locales, as all folds are 1-1, and so
there never will be a multi-char fold. But there could be such folds in
a UTF-8 locale, so the regnodes have to be packed to work for that
eventuality. The blamed commit did not do that, and because this issue
doesn't arise unless there is a string long enough to trigger the
problem, this wasn't found until now. What is needed, and what this
commit does, is for the unfolded characters to be accumulated in both
passes. The code that looks for potential multi-char fold issues
handles both folded and unfolded inputs, so will work.
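The node-packing rule described above can be illustrated with a small sketch. This is not the regcomp.c code: the function names are hypothetical, and the fold table is replaced by a toy assumption that only 'f' and 's' can begin or continue a multi-character fold.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the real fold tables (an assumption of this sketch):
 * 'f' and 's' may begin or continue a multi-character fold such as
 * "ff", "ffi" or "ss", so a node should not end on one of them. */
int may_continue_fold(char c)
{
    return c == 'f' || c == 's';
}

/* How many bytes of s[0 .. remaining) should go into the next node,
 * given a per-node limit of max_len bytes. */
size_t next_node_len(const char *s, size_t remaining, size_t max_len)
{
    size_t len;

    assert(max_len > 0);
    if (remaining <= max_len)
        return remaining;           /* the rest fits in one node */

    len = max_len;
    while (len > 0 && may_continue_fold(s[len - 1]))
        len--;                      /* back off past non-final characters */

    /* If every candidate byte might continue a fold, split at the limit. */
    return len > 0 ? len : max_len;
}
```

With a limit of 4, the string "abcffi" is packed as "abc" followed by "ffi" rather than "abcf" followed by "fi", so the three characters of the fold stay in one node. The same routine has to give the same answer in the sizing pass and in the filling pass, which is why the characters must be available in both.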
Steve Hay [Tue, 6 Jan 2015 18:13:26 +0000 (18:13 +0000)]
minitest needs config.h to build $(MINIPERL)
Since f7219c0a9696421192a4830631fa6e3fd28adf39 didn't work out, just add
the config.h dependency directly to minitest instead (which was my main
reason for doing this anyway).
We can also drop the utils dependency from minitest since it doesn't seem
to be required other than to get git_version.h/Config_git.pl made along
the way, but we can add that dependency directly instead. In this way,
minitest still works, but now without building the full perl.exe and
perl5XX.dll as well (which was previously happening as a consequence of
the utils dependency).
Steve Hay [Tue, 6 Jan 2015 16:06:46 +0000 (16:06 +0000)]
Revert commit f7219c0a9696421192a4830631fa6e3fd28adf39
That broke the (full) build. Sorry for only minitest-ing before pushing :-/
Father Chrysostomos [Tue, 6 Jan 2015 05:58:53 +0000 (21:58 -0800)]
pad.c: Remove unused context params
Daniel Dragan [Sun, 4 Jan 2015 22:49:09 +0000 (17:49 -0500)]
refactor gv_add_by_type
gv_add_by_type was added in commit d5713896ec in 5.11.0. Improve
gv_add_by_type by making it return the newly created SV*, instead of
the GV *, for which the caller must deref both the GV head to get svu and
then deref a slice into the GP, even though it already derefed svu and GP
right before, to figure out whether to call gv_add_by_type in the first
place. The original version of this patch had gv_add_by_type returning a
SV ** to ensure lvalue-ness, but it was discovered that this wasn't needed
and wasn't smart.
-rename gv_add_by_type since it was removed from public api and its proto
changed
-remove null check since it is impossible to pass null through GvAVn(),
and unlikely with gv_AVadd, null segvs reliably crash in the rare case of
a problem
-instead of S_gv_init_svtype and gv_add_by_type using a tree of logic/
conditional jumps in asm, use a lookup table, GPe (e=enum or entry)
enums are identical to offsets into the GP struct, all of them fit under
0xFF, if the CC and CPU arch wants, CC can load the const once into a
register, then use the number for the 2nd deref, then use the number again
as an arg to gv_add_by_type, the low (&~0xf) or high (<<2) 2 bits in a
GPe can be used for something else in the future since GPe is pointer
aligned
-SVt_LAST triggers "panic: sv_upgrade to unknown type", so use that value
for entries of a GP which are not SV head *s and are invalid to pass as
an arg
-remove the tree of logic in S_gv_init_svtype, replace with a table
-S_gv_init_svtype is now tail call friendly and very small
-change the GV**n to be rvalues only, assigning to GV**n is probably a
memory leak
-fix 1 core GV**n as lvalue use
-GvSVn's unusual former definition is from commit 547f15c3f9 in 2005
and DEFSV as lvalue is gone in core as of commit 414bf5ae08 from 2008
since all the GV**n macros are now rvalues, this goes too
-PTRPTR2IDX and PTRSIZELOG2 could use better names
-in pp_rv2av don't declare strings like that, the VC linker won't dedup
them, and other parts of core also have "an ARRAY"; perl521.dll previously
had 2 "an ARRAY" and "a HASH" strings in it due to this
before: VC 2003 32 perl521.dll .text 0xc8813 in machine code bytes
after: .text 0xc8623
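The idea of using struct offsets as slot identifiers, with a lookup table instead of a tree of conditionals, can be sketched as follows. Everything here (struct toy_gp, the TOY_ and SLOT_ names, the two functions) is hypothetical and much simpler than the real GP and GPe definitions; TOY_LAST plays the part that SVt_LAST plays in the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a GP: every member is pointer sized. */
struct toy_gp {
    void *sv;
    void *form;   /* stands for a member that must never be auto-created */
    void *av;
    void *hv;
};

enum toy_type { TOY_SCALAR, TOY_ARRAY, TOY_HASH, TOY_LAST /* invalid */ };

/* Slot identifiers are simply byte offsets into the struct. */
enum toy_slot {
    SLOT_SV   = offsetof(struct toy_gp, sv),
    SLOT_FORM = offsetof(struct toy_gp, form),
    SLOT_AV   = offsetof(struct toy_gp, av),
    SLOT_HV   = offsetof(struct toy_gp, hv)
};

/* One table entry per pointer-sized member, replacing a switch. */
static const unsigned char type_for_slot[] = {
    TOY_SCALAR, TOY_LAST, TOY_ARRAY, TOY_HASH
};

/* Map a slot to the type that should be created for it. */
enum toy_type slot_type(enum toy_slot slot)
{
    size_t idx = (size_t)slot / sizeof(void *);

    assert(idx < sizeof type_for_slot);
    return (enum toy_type)type_for_slot[idx];
}

/* The same number doubles as the offset used to reach the member. */
void **slot_address(struct toy_gp *gp, enum toy_slot slot)
{
    return (void **)((char *)gp + (size_t)slot);
}
```

Because the slot value is both the table key and the member offset, a caller can load the constant once and reuse it for the dereference and for the call, which is the saving the commit message describes.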
Father Chrysostomos [Tue, 6 Jan 2015 05:41:26 +0000 (21:41 -0800)]
perly.y: Don’t call op_lvalue on refgen kid
ck_spair also applies lvalue context to the kid ops, so we just end up
calling op_lvalue twice on the same ops. It’s harmless (being
idempotent), but wasteful.
Father Chrysostomos [Tue, 6 Jan 2015 05:32:03 +0000 (21:32 -0800)]
Revert "const + static vtables in threads::shared"
This reverts commit 7105b7e7a5e49caa06b8d7ef71008838ec902227.
Steve Hay [Tue, 6 Jan 2015 13:10:25 +0000 (13:10 +0000)]
miniperl needs config.h to build
Ricardo Signes [Tue, 6 Jan 2015 12:10:13 +0000 (07:10 -0500)]
perlpolicy: add a missing verb
Steve Hay [Tue, 6 Jan 2015 09:22:08 +0000 (09:22 +0000)]
Can't put a comment in the middle of a command broken across lines
Actually, you can in the nmake makefile (win32/Makefile), which is surely
how it slipped through in commit 4ea01bd3b4, but you can't do it in the
dmake makefile (win32/makefile.mk). Change it in both for consistency.
Father Chrysostomos [Tue, 6 Jan 2015 04:31:46 +0000 (20:31 -0800)]
Increase perl5db.pl’s $VERSION to 1.47
Father Chrysostomos [Tue, 6 Jan 2015 02:07:38 +0000 (18:07 -0800)]
Update address for E. Choroba
E. Choroba [Tue, 6 Jan 2015 02:05:17 +0000 (18:05 -0800)]
perl5db.pl: Undefined subroutine &DB::db_warn
Perl debugger sometimes tries to call a non-existent subroutine
db_warn. Its real name is _db_warn, though.
How to replicate:
perl -d -e1
!/
Output:
/bin/bash: /: Is a directory
Undefined subroutine &DB::db_warn called at /usr/lib/perl5/5.18.1/perl5db.pl line 6740.
at /usr/lib/perl5/5.18.1/perl5db.pl line 6740.
DB::_db_system('/bin/bash', '-c', '/') called at /usr/lib/perl5/5.18.1/perl5db.pl line 3923
DB::Obj::_handle_sh_command('DB::Obj=HASH(0x105df48)') called at /usr/lib/perl5/5.18.1/perl5db.pl line 2992
DB::DB called at -e line 1
Ricardo Signes [Tue, 6 Jan 2015 02:31:49 +0000 (21:31 -0500)]
Merge branch 'perlpolicy' into blead
Ricardo Signes [Fri, 19 Dec 2014 02:25:27 +0000 (21:25 -0500)]
perlpolicy: clarify what "feature can be replaced" means
Ricardo Signes [Fri, 19 Dec 2014 02:21:58 +0000 (21:21 -0500)]
perlpolicy: the point is caution, not low stakes
Ricardo Signes [Wed, 17 Dec 2014 00:33:00 +0000 (19:33 -0500)]
forward reference from "new features" to "experimental"
Ricardo Signes [Wed, 17 Dec 2014 00:32:47 +0000 (19:32 -0500)]
proposed changes for perlpolicy updates
see http://nntp.perl.org/group/perl.perl5.porters/219866
Chad Granum [Mon, 5 Jan 2015 16:30:00 +0000 (08:30 -0800)]
Test-Simple Version Bump, 1.301001_097 (RC17)
Daniel Dragan [Mon, 5 Jan 2015 06:44:59 +0000 (01:44 -0500)]
const + static vtables in threads::shared
This makes threads::shared have no non-NULL initialized RW static data.
Uninitialized and NULL filled RW data like PL_sharedsv_space and
prev_signal_hook remain, but on some OSes/CCs (Win32 with special tweaks),
this means that now the RW data section in threads::shared shared library
has no disk representation. Static the remaining RW vars to trim the
symbol table on non-Win32.
Daniel Dragan [Mon, 5 Jan 2015 06:27:11 +0000 (01:27 -0500)]
const a PERLIO vtable in PerlIO::encoding
This makes PerlIO::encoding's shared library free of any perl caused RW
static data.
Tony Cook [Tue, 6 Jan 2015 00:24:53 +0000 (11:24 +1100)]
make minitest (mostly) work on Win32
The only test left failing is op/glob.t, since I couldn't find the
cause of the failure
Tony Cook [Tue, 9 Dec 2014 04:13:04 +0000 (15:13 +1100)]
minitest: run the same tests on win32 as on POSIXish systems
Tony Cook [Tue, 9 Dec 2014 04:06:05 +0000 (15:06 +1100)]
minitest: miniperl on win32 always displays the x86 arch, so skip testing it
Tony Cook [Tue, 9 Dec 2014 03:38:38 +0000 (14:38 +1100)]
minitest op/magic.t: skip the env_is() tests on Win32 miniperl
Since 1070c8d6 env_is() requires the Win32 module on Win32, which
miniperl can't load
Tony Cook [Mon, 8 Dec 2014 04:11:29 +0000 (15:11 +1100)]
Win32 minitest: -k is never available on Win32
Given the structure of the tests and the code, I can only assume
S_ISVTX is only unavailable on Win32 out of the systems we run
minitest on.
Tony Cook [Mon, 8 Dec 2014 02:27:35 +0000 (13:27 +1100)]
miniperl on Win32 doesn't have fork()
Tony Cook [Mon, 8 Dec 2014 01:56:07 +0000 (12:56 +1100)]
some socket functions aren't available under Win32 miniperl
This only skips for bind, connect, accept() and select().
Other functions are tested in coreamp.t, but either aren't called with
valid handles, so don't reach the "unimplemented" errors, or don't
trigger the errors for other reasons.
fixes: op/coreamp.t, op/sselect.t, op/tie_fetch_count.t
Tony Cook [Mon, 8 Dec 2014 00:44:52 +0000 (11:44 +1100)]
disable Win32 sloppy stat in io/fs.t, io/stat.t
8ce7a7e8b08f added a line to write_buildcustomize.pl to enable
${^WIN32_SLOPPY_STAT} in lib/buildcustomize.pl on Win32.
This meant the nlink value from stat wasn't being populated correctly
causing the link count tests to fail
Tony Cook [Tue, 16 Dec 2014 06:05:18 +0000 (17:05 +1100)]
use textmode when opening scripts in miniperl to match perl
fixes io/data.t
This could be considered a bug in io/data.t, since it writes the scripts
in text mode, but making miniperl behave closer to perl may fix
other issues too.
Tony Cook [Tue, 16 Dec 2014 06:04:09 +0000 (17:04 +1100)]
build miniperl with PerlIO
Several tests use PerlIO layers (:utf8, :pop) without testing for it.
non-PerlIO builds were vaguely deprecated in 5.18.0 and can no longer be
enabled on POSIX systems through Configure, so making miniperl PerlIO
on Win32 is no big stretch
minitests failing now:
io/data.t
io/fs.t
op/coreamp.t
op/filetest.t
op/fork.t
op/glob.t
op/heredoc.t
op/magic.t
op/sselect.t
op/stat.t
op/tie_fetch_count.t
Tony Cook [Mon, 8 Dec 2014 00:01:34 +0000 (11:01 +1100)]
t/TEST: glob the supplied filenames on Win32
since Win32 perl doesn't glob by default
at this point the following tests fail:
io/dup.t
io/fs.t
io/open.t
io/perlio_leaks.t
op/coreamp.t
op/filetest.t
op/fork.t
op/fresh_perl_utf8.t
op/glob.t
op/heredoc.t
op/magic.t
op/read.t
op/readline.t
op/sselect.t
op/stat.t
op/substr.t
op/tie_fetch_count.t
op/write.t
Tony Cook [Sun, 7 Dec 2014 23:48:32 +0000 (10:48 +1100)]
use TEST for minitest, same as POSIX systems
Test::Harness now requires IO at all times, which means it can't
be used with miniperl
many tests fail with minitest at this point
Father Chrysostomos [Mon, 5 Jan 2015 22:13:05 +0000 (14:13 -0800)]
b.t: Fix test sequence numbers
I should have tested more thoroughly before pushing a462fa007.
Daniel Dragan [Mon, 5 Jan 2015 20:01:04 +0000 (15:01 -0500)]
fix test fail on unthreaded perl
../ext/B/t/b.t (Wstat: 65280 Tests: 0 Failed: 0)
Non-zero exit status: 255
Parse errors: No plan found in TAP output
part of [perl #123544]
Daniel Dragan [Mon, 5 Jan 2015 04:37:58 +0000 (23:37 -0500)]
const the custom op struct in Devel::Peek
This makes the Devel::Peek shared library free of perl caused RW static
data vars, and if CC/OS platform allows, removes RW data section from the
shared library.
Daniel Dragan [Mon, 5 Jan 2015 03:42:49 +0000 (22:42 -0500)]
pp.c pp_split GvAVn can't return NULL
clang optimized the function call free branch of GvAVn to skip the
"if (ary)" test, but the function call creation branch also will never
return NULL (but no CC knows that) so use goto to skip the test on both
halves of GvAVn.
Daniel Dragan [Mon, 5 Jan 2015 03:27:04 +0000 (22:27 -0500)]
make B pseudofork safe
B has used MY_CXT incorrectly since commit 89ca4ac7af (5.7.2): a MY_CXT
was declared, but it was never cloned after a win32 pseudofork, negating
the whole point of using MY_CXT. This was probably an oversight since the
old code didn't use a CLONE method, and would have been threadsafe only if
the module was loaded after a pseudofork/ithread creation. Rearrange
my_cxt_t so there isn't an alignment gap between the 32 bit and 32/64 bit
ptrs.
This patch does not address the current lack of CLONE method in
Opcode:: and in File::DosGlob. File::Glob was fixed in commit facf34ef48,
DynaLoader in commit 8c472fc1d4, and re:: doesn't use MY_CXT anymore at all.
Failure message of the test before the fix was
not ok 88 - special SV table works after psuedofork
# Failed test 'special SV table works after psuedofork'
# at b.t line 229.
# got: 'B::PVNV'
# expected: 'B::SPECIAL'
Father Chrysostomos [Mon, 5 Jan 2015 07:18:02 +0000 (23:18 -0800)]
perldiag: Document ‘Bad symbol for scalar’
Originally this was a separate message in gv.c, with an exception
listed in diag.t
d5713896ec merged several functions together, changing the exception
to ‘Bad symbol for %s’.
bb85b28a added diag_listed_as in the wrong place.
de6f7947 moved it to the right place, removing the diag.t entry.
But all this time ‘Bad symbol for scalar’ remained undocumented.
Father Chrysostomos [Mon, 5 Jan 2015 07:16:58 +0000 (23:16 -0800)]
t/base/lex.t: Remove commented-out test
It has been commented out since it was added in 2b92dfceaa9d.
Father Chrysostomos [Mon, 5 Jan 2015 07:16:21 +0000 (23:16 -0800)]
complement can have OPpTARGET_MY
It always reads its argument at the outset and always returns its
target, so there is no reason its target cannot be a lexical. (The
OPpTARGET_MY optimisation makes $lexical = <some op> have the op
write directly to the lexical; the assignment gets optimised away.)
H.Merijn Brand [Mon, 5 Jan 2015 11:33:49 +0000 (12:33 +0100)]
Put pthread in front of libswanted and add cl
If pthread is found on HP-UX, it is required to be in front and
libcl is required too
Chris 'BinGOs' Williams [Sun, 4 Jan 2015 19:09:11 +0000 (19:09 +0000)]
Update Module-Metadata to CPAN version 1.000025
[DELTA]
1.000025 2015-01-04 18:56:00Z
- evaluate version assignment in a clean environment, to fix assignment in a
block (RT#101095)
Father Chrysostomos [Sun, 4 Jan 2015 02:55:49 +0000 (18:55 -0800)]
reg_nocapture.t: Skip %+ tests under miniperl
Father Chrysostomos [Sun, 4 Jan 2015 02:53:29 +0000 (18:53 -0800)]
perl.h:MY_CXT_CLONE: void *, not void **
C++ fails otherwise, and PL_my_cxt_list is void ** so individual
elephants (aka elements) should be void *.
Daniel Dragan [Sat, 3 Jan 2015 05:15:55 +0000 (00:15 -0500)]
const a table in B
B is now free of all RW static data except for my_cxt_index
Daniel Dragan [Fri, 2 Jan 2015 23:59:06 +0000 (18:59 -0500)]
reorder MY_CXT_CLONE for less memory reads
On VC 2003 32, taking a very simple CLONE XSUB, specifically
Time::HiRes::CLONE, shows a reduction from 0x53 to 0x47 bytes of machine
code. This is because my_cxt_index previously had to be reread after the
memcpy function call in case my_cxt_index was changed by memcpy (GCC usually
inlines short fixed length memcpys; on VC P5P perl, the option to inline
memcpy is off). Also, the new my_cxtp does not need to be saved in a non-vol
anymore; previously my_cxtp had to be copied to a non-vol for it to be
available after the memcpy function call. In a simple XSUB like the one
mentioned here, saving and restoring the non-vol register is also
removed.
See details in perl #123534.
Father Chrysostomos [Sat, 3 Jan 2015 04:20:17 +0000 (20:20 -0800)]
pad.c: Obsolete comment
This comment, added by 3291825f, was made obsolete by 0f94cb1f.
Father Chrysostomos [Sat, 3 Jan 2015 04:15:10 +0000 (20:15 -0800)]
Fix CvOUTSIDE for state subs in predeclared subs
use 5.018;
use experimental 'lexical_subs';
$::x = "global";
sub x;
sub x {
state $x = 42;
state sub x { print eval '$x', "\n" }
\&x;
}
x()->();
__END__
Output:
Segmentation fault: 11
Because this line in pad.c:S_findpadlex:
1141 const PADLIST * const padlist = CvPADLIST(cv);
is trying to read this SV:
SV = UNKNOWN(0x76) (0xaa170e4fd) at 0x10060c928
REFCNT = 1697135711
FLAGS = (PADSTALE,TEMP,GMG,SMG,IOK,pNOK,pPOK,UTF8)
(i.e., gibberish).
During compilation, ‘sub x{’ creates a new CV. When the sub is about
to be installed (when the final ‘}’ is reached), the existing stub
must be reused. So everything is copied from the new CV (PL_compcv)
to the stub. Also, any CvOUTSIDE pointers of nested subs get updated
to point to the erstwhile stub.
State subs were not getting their CvOUTSIDE pointers updated. This
patch implements that.
Chad Granum [Sat, 3 Jan 2015 21:04:16 +0000 (13:04 -0800)]
Test-Simple Version Bump, 1.301001_096 (RC16)
Mainly fixes for older perls. Still important to bring this in line with
what is on cpan.
Hugo van der Sanden [Tue, 16 Dec 2014 14:50:09 +0000 (14:50 +0000)]
check more carefully for empty negative lookahead
We replace with OPFAIL, but if we wait till study_chunk() to do that it
gets rather more complicated.
Chris 'BinGOs' Williams [Sat, 3 Jan 2015 15:56:34 +0000 (15:56 +0000)]
Update IO-Socket-IP to CPAN version 0.35
[DELTA]
0.35 2015/01/02 19:45:20
[BUGFIXES]
* Restore blocking mode after timeout connect immediate success
(RT100947)
* Avoid CORE:: prefixing so global override modules work (RT101174)
* Ensure that ->peer{host,port,hostname,service} never die even when
unconnected (RT98759)
Craig A. Berry [Sat, 3 Jan 2015 03:27:21 +0000 (21:27 -0600)]
Revert "Fix PerlIO vtables on VMS."
This reverts commit 0c2c3d000e799a35bdc2bdd24feaf77cf854a2dd.
It's not needed after 400638aa931c47.
Craig A. Berry [Sat, 3 Jan 2015 00:42:31 +0000 (18:42 -0600)]
Ditch the custom extern/const model on VMS.
We've been using globaldef/globalref for global data since eons
ago. It was a requirement for the ancient and long-defunct VAXC
compiler (not to be confused with DEC C for OpenVMS VAX), but
DEC/Compaq/HP C supports extern and const pretty much the way
everybody else does, and has for many years. HP C also supports
globaldef/globalref for backward compatibility, but the C++ compiler
does not, so continuing to use it means two different models for
C and C++.
While there is a slight theoretical benefit to using the old model
and its fine-grained control of program section attributes and
having all the read-write variables in one program section and all
the read-only variables in another, there is no measurable
performance or code size benefit, and being different just isn't
worth the aggravation.
So let's resign ourselves to having a separate program section in
the shareable image for each global item and make a couple of places
in the code easier on everyone's eyeballs and less likely to collide
with other work.
Craig A. Berry [Fri, 2 Jan 2015 15:25:05 +0000 (09:25 -0600)]
Simplify PIC specification in perlshr.exe.
PIC has no meaning on Alpha as all code generated by the compiler
is position independent. So only specify it for VAX. This allows
us to get rid of the test for whether we are on Itanium.
Craig A. Berry [Fri, 2 Jan 2015 15:21:00 +0000 (09:21 -0600)]
Explicitly name linker map on VMS.
By default the linker takes the base name of the first object file
and uses that when creating the name of the linker map file, which
means we've been getting a file named dynaloader.map for the main
shareable image map. Name it after the target instead.
Craig A. Berry [Fri, 2 Jan 2015 15:17:57 +0000 (09:17 -0600)]
Remove dead line of code from vms/test.com.
Hasn't been needed since 34b5aed4c569.
Craig A. Berry [Thu, 1 Jan 2015 15:02:37 +0000 (09:02 -0600)]
Fix PerlIO vtables on VMS.
f0e5c859d36afe5 broke the build because it caused the PerlIO_funcs
declarations to be const in perlio.h and EXTPERLIO in perliol.h
and on VMS, EXTPERLIO was EXTCONST which is globalref. The compiler
considers globalref and const to be incompatible.
As a workaround, make EXTPERLIO "extern const" on VMS only. The
whole custom global data model on VMS probably needs a rethink,
but this gets the build working again.
Chris 'BinGOs' Williams [Wed, 31 Dec 2014 21:54:07 +0000 (21:54 +0000)]
Update ExtUtils-Manifest to CPAN version 1.70
[DELTA]
1.70 2014-12-31
- also skip _eumm, an artifact of ExtUtils::MakeMaker 7.05*
- avoid unreliable permissions tests on cygwin
Matthew Horsfall [Wed, 31 Dec 2014 16:05:57 +0000 (11:05 -0500)]
Perldelta for /n regexp flag. Also ?: to C<?:> in perlre.pod.
Steve Hay [Tue, 30 Dec 2014 12:00:54 +0000 (12:00 +0000)]
Remove sources of "unreferenced label" warning on Win32
and then remove the disabling of that warning.
Steve Hay [Wed, 31 Dec 2014 13:43:52 +0000 (13:43 +0000)]
Remove redundant -I..\lib arguments from some Win32 makefile command-lines
Invocations of $(PERLEXE) from win32\ do not need -I..\lib since $(PERLEXE)
is ..\perl.exe, which will pick up the lib\ folder in ..\ anyway.
Invocations of perl.exe from t\ (which may be a copy of either perl.exe or
miniperl.exe from the top-level folder) also do not need -I..\lib since
they all run the harness program, which fixes up @INC with exactly that
..\lib folder in a BEGIN block anyway.
Daniel Dragan [Tue, 23 Dec 2014 09:02:33 +0000 (04:02 -0500)]
make win32 harness process use tested perl binary
On Unix /t/perl is a symlink to /perl and the OS knows they are the same
file. On Win32 perl.exe and perl5**.dll are copied from / to /t, and the OS
thinks they are 2 separate files (and they are on disk). Both Win32 and
Unix use MMIO and COW/inter-process sharing for their running binaries. On
Unix the symlink means the 2 perl binaries will be memory mapped to the
same physical memory when running. On Win32 they won't be since they are 2
separate files. It is a waste of CPU cache/physical memory for the Win32
harness process and the child .t processes to not share the same disk
file/phy mem/same binary. Previously only the XS DLLs in /lib/auto were
shared between harness process and child .t processes, now perl.exe and
perl5**.dll will be shared between the 2 processes. Copying /perl.exe to
/t/perl.exe is from 1st commit of current Makefile in commit 68dc074516
and predates Win32 perl running harness which is from commit 137443ea0a.
Also fix the broken "-I.\lib" in test-notty in makefile.mk. This problem
was discovered with VMMap. This patch is slightly related to
[perl #114704].
David Mitchell [Mon, 22 Dec 2014 21:36:14 +0000 (21:36 +0000)]
Configure: silence ASan warnings
When run under -fsanitize=undefined, some of the try.c's that are compiled
and executed give runtime warnings. Since the intent of these particular
executables is to probe beyond certain limits in order to determine those
limits, these warnings can be safely ignored. So file them in /dev/null.
David Mitchell [Wed, 31 Dec 2014 11:16:06 +0000 (11:16 +0000)]
File::Glob: avoid qsort() on no entries
If a glob doesn't match anything, it will try to call qsort()
with a null pointer, and on my OS, qsort() is marked as needing a non-null
arg, which clang 3.6 is now detecting.
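A guard of the following shape avoids the call entirely when there is nothing to sort. This is a minimal sketch with hypothetical names (sort_matches, compare_names), not the File::Glob source.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback: order two path strings with strcmp(). */
static int compare_names(const void *a, const void *b)
{
    return strcmp(*(const char * const *)a, *(const char * const *)b);
}

/* Sort the matched paths, skipping qsort() when there are none. */
void sort_matches(const char **paths, size_t count)
{
    if (paths == NULL || count == 0)
        return;                 /* never hand qsort() a null pointer */
    qsort(paths, count, sizeof *paths, compare_names);
}
```

Calling sort_matches(NULL, 0) then simply returns, so the empty-glob case never reaches the library function.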
David Mitchell [Tue, 23 Dec 2014 19:32:43 +0000 (19:32 +0000)]
clone PL_cv_has_eval and PL_savebegin
These two boolean vars weren't being cloned in new threads, and in
debugging builds were getting set to 0xab, which -fsanitize=undefined
regarded as no suitable value for a boolean.
David Mitchell [Tue, 23 Dec 2014 10:38:01 +0000 (10:38 +0000)]
sv_vcatpvfn_flags() avoid array bounds err
clang -fsanitize=undefined is being a bit too clever for its own good
here.
The code looks something like
U8 vhex[VHEX_SIZE];
...
v = vhex + ...;
if (v < vend) ...
The code itself is safe, but ASan detects if you've added a value
greater than the buffer size to vhex and whines.
I've changed it so that the conditional comes first and is done in such
a way that arbitrary values can't be added to vhex.
To reproduce:
printf "%.1000a\n", 1;
gives
sv.c:12327:34: runtime error: index 1000 out of bounds for type 'U8 [17]'
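The reordering can be pictured like this: test the offset against the buffer size first, and only then form the pointer. The names below are hypothetical and the size 17 is just borrowed from the warning text; the real fix in sv.c may be shaped differently.

```c
#include <stddef.h>

#define TOY_BUF_SIZE 17   /* borrowed from the 'U8 [17]' in the warning */

/* Return buf + offset, or NULL if the offset is not inside the buffer. */
unsigned char *advance_within(unsigned char *buf, size_t offset)
{
    if (offset >= TOY_BUF_SIZE)
        return NULL;      /* the out-of-range pointer is never formed */
    return buf + offset;
}
```

Because the comparison is done on the integer offset, no arbitrary value is ever added to the array, which is what the sanitizer objects to.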
David Mitchell [Mon, 22 Dec 2014 20:57:52 +0000 (20:57 +0000)]
asan_ignore: exclude S_expect_number()
This function numifies the field width string in something like
printf "%10f". It handles integer overflow itself, so suppress
ASan warnings, e.g.
sv.c:10716:26: runtime error: signed integer overflow:
922337203 * 10 cannot be represented in type 'int'
David Mitchell [Mon, 22 Dec 2014 20:23:28 +0000 (20:23 +0000)]
fix integer overflow in S_study_chunk().
Don't increment delta if it's "infinity" (SSize_t_MAX)
Found by -fsanitize=undefined:
regcomp.c:4999:11: runtime error: signed integer overflow:
9223372036854775807 + 1 cannot be represented in type 'ssize_t' (aka 'long')
David Mitchell [Mon, 22 Dec 2014 20:12:22 +0000 (20:12 +0000)]
pack(): avoid << of negative values
Treat the string as U8* rather than char* when doing all the
bit shifts for uuencode. That stops these warnings under ASan:
pp_pack.c:1890:34: runtime error: left shift of negative value -127
pp_pack.c:1891:34: runtime error: left shift of negative value -126
pp_pack.c:1899:34: runtime error: left shift of negative value -1
pp_pack.c:1900:30: runtime error: left shift of negative value -31
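The effect of reading the bytes as unsigned can be shown with a short sketch. The function name join_bytes is hypothetical and the operation is simpler than the uuencode code; it only illustrates why the cast removes the negative operands.

```c
/* Combine two bytes into one value. Reading them through an
 * unsigned char pointer means a byte such as 0x81 is seen as 129,
 * not -127, so the left shift never has a negative operand. */
unsigned int join_bytes(const char *s)
{
    const unsigned char *u = (const unsigned char *)s;

    return ((unsigned int)u[0] << 8) | (unsigned int)u[1];
}
```

On platforms where plain char is signed, shifting s[0] directly would first promote a negative value to int, which is the situation the warnings above report.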
David Mitchell [Mon, 22 Dec 2014 20:04:59 +0000 (20:04 +0000)]
avoid integer overflow in pp_flop()
This:
@a=(0x7ffffffffffffffe..0x7fffffffffffffff);
could produce under ASan:
pp_ctl.c:1212:19: runtime error: signed integer overflow:
9223372036854775807 + 1 cannot be represented in type 'IV' (aka 'long')
so avoid post-incrementing the loop var on the last iteration.
This fix is more to shut ASan up than an actual bug, since the
bad value on the last iteration wouldn't actually be used.
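One way to write such a loop, shown here as a sketch rather than the actual pp_flop code, is to test for the last element before incrementing. The function count_range is hypothetical, and long stands in for IV.

```c
#include <limits.h>   /* LONG_MAX, used in the usage note below */

/* Visit every value in lo..hi inclusive and return how many there were,
 * without ever computing hi + 1. */
unsigned long count_range(long lo, long hi)
{
    unsigned long n = 0;
    long i;

    if (lo > hi)
        return 0;
    for (i = lo; ; i++) {
        n++;
        if (i == hi)
            break;        /* leave before the increment on the last pass */
    }
    return n;
}
```

A call such as count_range(LONG_MAX - 1, LONG_MAX) visits both values and never increments past LONG_MAX.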
David Mitchell [Mon, 22 Dec 2014 16:25:59 +0000 (16:25 +0000)]
fix more -IV_MIN negations
Doing uv = -iv is undefined behaviour if iv happens to be IV_MIN.
This occurs in several places in the perl sources.
These ones were found by visual code inspection rather than
using -fsanitize=undefined, but I've added extra tests so that
-fsanitize could find them now.
David Mitchell [Mon, 22 Dec 2014 09:34:40 +0000 (09:34 +0000)]
fix undefined float behaviour in pack('f')
The C standard says that the value of the expression (float)double_var is
undefined if 'the value being converted is outside the range of values
that can be represented'.
So to shut up -fsanitize=undefined:
my $p = pack 'f', 1.36514538e67;
giving
runtime error: value 1.36515e+67 is outside the range of representable values of type 'float'
explicitly handle the out of range values.
Something similar is already done under defined(VMS) && !defined(_IEEE_FP),
except that there it floors to +/- FLT_MAX rather than +/- (float)NV_INF.
I don't know which branch is best, and whether they should be merged.
This fix was suggested by Aaron Crane.
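Handling the out-of-range values explicitly might look like the sketch below, which maps them to infinities in the spirit of the (float)NV_INF branch mentioned above. The function name narrow_to_float is hypothetical, and the INFINITY macro assumes a C99 <math.h>.

```c
#include <float.h>
#include <math.h>

/* Convert a double to float without relying on an out-of-range cast. */
float narrow_to_float(double d)
{
    if (d > FLT_MAX)
        return INFINITY;
    if (d < -FLT_MAX)
        return -INFINITY;
    return (float)d;      /* in range (or NaN), so the cast is defined */
}
```

The VMS branch described above would return FLT_MAX and -FLT_MAX in the first two cases instead.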
David Mitchell [Sun, 21 Dec 2014 00:40:13 +0000 (00:40 +0000)]
avoid integer overflow in Perl_av_extend_guts()
There were two issues; first the 'overextend' algorithm (add a fifth of
the current size to the requested size) could overflow,
and secondly MEM_WRAP_CHECK_1() was being called with newmax+1,
which could overflow if newmax happened to equal SSize_t_MAX.
e.g.
$a[0x7fffffffffffffff] = 1
$a[5] = 1; $a[0x7fffffffffffffff] = 1
could produce under ASan:
av.c:133:16: runtime error: signed integer overflow:
9223372036854775807 + 1 cannot be represented in type 'long'
av.c:170:7: runtime error: signed integer overflow:
9223372036854775807 + 1 cannot be represented in type 'long'
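Both overflows can be avoided by comparing against the limit before adding. The sketch below uses long for SSize_t and hypothetical function names; it shows the checking pattern only, not the code in av.c.

```c
#include <assert.h>
#include <limits.h>

/* Requested top index plus a fifth of the current one, capped at LONG_MAX. */
long overextended_max(long key, long cur_max)
{
    long extra;

    assert(key >= 0 && cur_max >= 0);
    extra = cur_max / 5;
    return key > LONG_MAX - extra ? LONG_MAX : key + extra;
}

/* Store max_index + 1 in *count and return 1, or return 0 if the
 * addition would overflow. */
int element_count(long max_index, long *count)
{
    if (max_index == LONG_MAX)
        return 0;
    *count = max_index + 1;
    return 1;
}
```

Rewriting "a + b > MAX" as "a > MAX - b" keeps every intermediate value representable, provided both operands are non-negative.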
David Mitchell [Sun, 21 Dec 2014 00:00:10 +0000 (00:00 +0000)]
asan_ignore: exclude Perl_pp_left_shift()
<< in perl maps directly to << in C, so don't warn about it when the RHS
is too big.
Fixes e.g.:
print 1 << 64
use integer; print 1 << 63
Typical ASan warning:
pp.c:1893:2: runtime error: left shift of 1 by 63 places cannot be represented in type 'IV' (aka 'long')
David Mitchell [Sat, 20 Dec 2014 16:40:52 +0000 (16:40 +0000)]
fix -IV_MIN negations
Doing uv = -iv is undefined behaviour if iv happens to be IV_MIN.
This occurs in several places in the perl sources.
Found by -fsanitize=undefined.
Here's a typical message:
sv.c:2864:7: runtime error: negation of -9223372036854775808 cannot be represented in type 'IV' (aka 'long'); cast to an unsigned type to negate this value to itself
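The remedy the message suggests, converting to an unsigned type before negating, can be sketched as follows. Here long and unsigned long stand in for IV and UV, and the function name is hypothetical.

```c
#include <limits.h>   /* LONG_MIN, used in the usage note below */

/* Magnitude of iv as an unsigned value, well defined even for LONG_MIN. */
unsigned long magnitude(long iv)
{
    if (iv < 0)
        return 0UL - (unsigned long)iv;   /* unsigned arithmetic wraps */
    return (unsigned long)iv;
}
```

For magnitude(LONG_MIN) the conversion happens first, so the subtraction is done in unsigned arithmetic and yields LONG_MAX + 1 as an unsigned value instead of invoking undefined behaviour.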
David Mitchell [Sat, 20 Dec 2014 15:30:01 +0000 (15:30 +0000)]
fix integer overflow in S_study_chunk().
It was calculating final_minlen + delta even when delta was already
SSize_t_MAX and final_minlen > 0.
This triggered it: /a(??{}){2}/.
Found by -fsanitize=undefined:
regcomp.c:5623:89: runtime error: signed integer overflow: 1 +
9223372036854775807 cannot be represented in type 'long'
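A saturating add is one common way to guard this kind of sum. The sketch below assumes both operands are non-negative and uses ptrdiff_t in place of SSize_t; sat_add is a hypothetical name and this is not necessarily how S_study_chunk() was changed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add two non-negative lengths, returning
 * PTRDIFF_MAX ("unbounded") instead of overflowing. */
ptrdiff_t sat_add(ptrdiff_t minlen, ptrdiff_t delta)
{
    if (delta > PTRDIFF_MAX - minlen)
        return PTRDIFF_MAX;   /* the sum would not fit: saturate */
    return minlen + delta;
}
```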
Karl Williamson [Sun, 14 Dec 2014 17:39:14 +0000 (10:39 -0700)]
handy.h Cast to unsigned before doing xor
It occurred to me that these macros could have an xor applied to a
signed value if the argument is signed, whereas the xor is expecting
unsigned.
Karl Williamson [Mon, 22 Dec 2014 05:02:30 +0000 (22:02 -0700)]
Empty \N{} in regex pattern should force /d to /u
\N{} is for Unicode names, even if the name is actually omitted.
(Accepting an empty name is, I believe, an accident, and now is
supported only for backwards compatibility.)
Karl Williamson [Mon, 22 Dec 2014 04:47:04 +0000 (21:47 -0700)]
regcomp.c: comment and white-space changes only
Karl Williamson [Wed, 31 Dec 2014 03:49:25 +0000 (20:49 -0700)]
warnings.pm: Fix too long verbatim lines
By not indenting verbatim text so much, we don't run over 79 columns.
Karl Williamson [Wed, 31 Dec 2014 03:48:26 +0000 (20:48 -0700)]
perlre: Fix too long verbatim line
Karl Williamson [Wed, 31 Dec 2014 03:50:39 +0000 (20:50 -0700)]
lib/B/Deparse.pm: refactor a hash slightly
Two of the three uses of this hash want the result to be of the form
"\cX". The other wants "^X". This changes the hash to be the common
substring to all three, and then the proper prefix is added to each.
Karl Williamson [Tue, 30 Dec 2014 21:13:34 +0000 (14:13 -0700)]
lib/B/Deparse.pm: Add comment
Karl Williamson [Tue, 30 Dec 2014 21:04:10 +0000 (14:04 -0700)]
lib/B/Deparse.pm: Generalize for non-ASCII platforms
This makes ASCII platform-specific code generalized to non-ASCII.
Karl Williamson [Tue, 30 Dec 2014 21:09:40 +0000 (14:09 -0700)]
lib/B/Deparse.pm: Output WARNING_BITS in binary
This binary value was being output as just another string, which would
cause bit patterns that happen to coincide with letters to be output as
those letters. This is not portable to EBCDIC, but outputting it as
\xXX is, which this commit does. I chose to output in hex instead of
octal, as I think that is the more modern thing to do, and it's easier
for me to grok the larger values when they are in hex.
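To illustrate the idea (Deparse.pm itself is Perl; this is a C sketch with a hypothetical function name), every byte is written as \xXX regardless of whether it happens to be a printable character in the platform's character set. The lowercase hex digits are an arbitrary choice here, not a statement about what Deparse emits.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write each input byte as a \xXX escape.
 * `out` must have room for 4 * len + 1 characters. */
void hex_escape(const unsigned char *bytes, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++) {
        /* 4 visible characters plus the terminating NUL */
        snprintf(out, 5, "\\x%02x", (unsigned)bytes[i]);
        out += 4;
    }
    *out = '\0';
}
```

For example, the byte 0x55 is the letter U on ASCII platforms but not on EBCDIC ones; escaped as \x55 it reads the same everywhere.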
Karl Williamson [Tue, 30 Dec 2014 20:55:42 +0000 (13:55 -0700)]
lib/B/Deparse.pm: Move hash to earlier in file
No other change besides the move is done. This is so the hash can be
used from another place than currently.
Karl Williamson [Mon, 29 Dec 2014 20:57:10 +0000 (13:57 -0700)]
perlpod: Latin1 pods need an =encoding
Karl Williamson [Thu, 25 Dec 2014 20:16:19 +0000 (13:16 -0700)]
regcomp.c: Fix [_A-Z] for EBCDIC
Special handling is required on EBCDIC for ranges that are subsets of
either a-z or A-Z. This is triggered when both ends are literals. It
is implemented by keeping a count of the literal endpoints, and when
that is two do the handling. But the count was not getting reset, so
it could go to 3, 4, ... so the special handling would only get
triggered if the range was the first thing in the brackets,
like [A-Z], but not if there was something before it, like [_A-Z]. The
solution is to reset the counter appropriately each time through the
loop. For the A-Z range, the ASCII-equivalent characters wrongly
matched were backslash and '}'. For a-z, it was '~'.
Karl Williamson [Thu, 25 Dec 2014 20:15:58 +0000 (13:15 -0700)]
regcomp.c: Replace dead code with NOT_REACHED
Daniel Dragan [Sun, 28 Dec 2014 20:59:38 +0000 (15:59 -0500)]
fix a broken optimization in win32/config_h.PL to stop excessive rebuilding
In commit 137443ea0a config_h.PL was introduced. There is no ML archive
from that time of the actual patches or their rationale. From day 1,
config_h.PL did not copy the new root config.h to the normal location of
config.h if the two files matched. This prevents redundant dirtying of
all core modules with the
"Makefile out-of-date with respect to "/make clean/rerunning of makefile.pl
/new make all cycle. But the optimization didn't work in practice since
the modules declare a dependency on /lib/CORE/config.h not /config.h.
Previously "touch"ing /win32/Makefile would trigger a mass rebuild,
even if config.h's contents were the same. Now a mass rebuild of modules
is triggered only if the config.h generated after "touch"ing
/win32/makefile differs from the old config.h. This reduces the
amount of time core devs have to spend working on Win32 perl.
Matthew Horsfall [Tue, 30 Dec 2014 00:21:39 +0000 (19:21 -0500)]
Add documentation for /n (non-capture) regexp flag.
Father Chrysostomos [Mon, 29 Dec 2014 14:24:12 +0000 (06:24 -0800)]
lex_assign.t: Actually test chomp
Father Chrysostomos [Mon, 29 Dec 2014 14:22:07 +0000 (06:22 -0800)]
lex_assign.t: Correct (s)cho(m)p comments
They were backwards. ‘s’ means a single item.
Father Chrysostomos [Mon, 29 Dec 2014 14:16:33 +0000 (06:16 -0800)]
op_private: Update note about targlex and trans
Father Chrysostomos [Mon, 29 Dec 2014 14:13:48 +0000 (06:13 -0800)]
Enable OPpTARGET_MY optimisation for cmp/<=>
We can only do it for <=> under ‘use integer’.
The non-integer <=> will push undef on to the stack. Enabling
the optimisation for it would cause \($lexical = $x <=> "nan") to
leave $lexical with its previous value and return a reference to
&PL_sv_undef.
Karl Williamson [Tue, 30 Dec 2014 01:27:42 +0000 (02:27 +0100)]
Fix breakage of 780fcc9
I got confused in writing this: the global needs to be cleared always,
and set to NULL.
Karl Williamson [Tue, 30 Dec 2014 01:39:40 +0000 (18:39 -0700)]
regexec.c: Suppress warning messages
A message is being generated on some compilers that two variables may be
uninitialized. In fact there is no code path that uses them
uninitialized, but initialize them anyway since the compiler is wrong.
Karl Williamson [Mon, 29 Dec 2014 20:15:57 +0000 (13:15 -0700)]
Raise warning on multi-byte char in single-byte locale
See http://nntp.perl.org/group/perl.perl5.porters/211909
Something is quite likely wrong with the logic if, say, in a Greek locale,
Unicode characters (especially Greek ones) are encountered. The same
character will be represented by two different code points. This
warning alerts the user to this undesirable state of affairs.
Karl Williamson [Mon, 29 Dec 2014 19:57:02 +0000 (12:57 -0700)]
perllocale: Nits