This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
Father Chrysostomos [Wed, 19 Aug 2015 20:10:16 +0000 (13:10 -0700)]
Disable lexical $_
This just disables the syntax and modifies the tests. The underlying
infrastructure has not been removed yet.
I had to change a couple of tests in cpan/.
Craig A. Berry [Sat, 26 Sep 2015 22:24:57 +0000 (17:24 -0500)]
killpg for VMS.
Implement our own killpg by scanning for processes in the specified
process group, which may not mean exactly the same thing as a Unix
process group, but at least we can now send a signal to a parent (or
master) process and all of its sub-processes. In Perl-land, this
means we can now send a negative pid like so:
kill SIGKILL, -$pid;
to signal all processes in the same group as $pid.
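For comparison, the same negative-pid convention exists at the C level on POSIX systems, where kill() with a negative pid targets a whole process group. The following is only a sketch under that assumption; it is not the VMS implementation from this commit, and signal_group is a hypothetical helper name.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send `sig` to every process in group `pgid`.
 * On POSIX, kill() with a negative pid signals the process group whose
 * ID is the absolute value of that pid. */
int signal_group(pid_t pgid, int sig)
{
    /* Refuse 0, 1 and negative IDs: kill(0, ...) and kill(-1, ...)
     * have special meanings that are not "one specific group". */
    if (pgid <= 1) {
        errno = EINVAL;
        return -1;
    }
    return kill(-pgid, sig);
}
```

Passing signal 0 performs only the existence and permission checks, so signal_group(getpgrp(), 0) is a harmless way to probe the caller's own group.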
Karl Williamson [Thu, 3 Sep 2015 00:00:55 +0000 (18:00 -0600)]
Make ext/XS-APItest/t/cophh.t work on EBCDIC
The new EBCDIC-only code will also work on ASCII platforms, but I left
the ASCII code as-is.
Karl Williamson [Wed, 23 Sep 2015 03:31:02 +0000 (21:31 -0600)]
t/re/pat.t: EBCDIC fix
Jarkko Hietaniemi [Fri, 25 Sep 2015 12:10:45 +0000 (08:10 -0400)]
Clarify FIRSTKEY and NEXTKEY usage.
Rafael Garcia-Suarez [Fri, 25 Sep 2015 07:29:28 +0000 (09:29 +0200)]
POD fix in the documentation for SvTHINKFIRST
Tony Cook [Tue, 22 Sep 2015 23:33:50 +0000 (09:33 +1000)]
[perl #126133] autodie touches its touch_me, make it writable
autodie's utime.t touches touch_me. In a git checkout that's fine
since the file is writable, but in a distribution all files are
read-only by default, and on Win32 the utime() call that the test
expects to succeed fails.
Per Sisyphus's note, also make win32/GNUmakefile writable to match the
other Win32 makefiles, since they're often modified to configure the
build.
Chris 'BinGOs' Williams [Tue, 22 Sep 2015 22:52:05 +0000 (23:52 +0100)]
Fix typo and Module-CoreList is 5.20150920 on the CPAN now
Peter Martini [Mon, 21 Sep 2015 13:19:14 +0000 (09:19 -0400)]
Update Module::CoreList
Peter Martini [Mon, 21 Sep 2015 12:37:35 +0000 (08:37 -0400)]
Bump the perl version in various places for 5.23.4.
Peter Martini [Mon, 21 Sep 2015 11:28:16 +0000 (07:28 -0400)]
Update release_schedule
Tick off 5.23.3
Peter Martini [Mon, 21 Sep 2015 11:27:05 +0000 (07:27 -0400)]
Porting/new-perldelta.pl regenerations
Peter Martini [Mon, 21 Sep 2015 11:04:57 +0000 (07:04 -0400)]
Add epigraph for 5.23.3
Peter Martini [Mon, 21 Sep 2015 01:38:35 +0000 (21:38 -0400)]
Add 5.23.3 to perlhist
Peter Martini [Mon, 21 Sep 2015 01:28:13 +0000 (21:28 -0400)]
Finalize perldelta with Acknowledgments
Peter Martini [Mon, 21 Sep 2015 00:11:37 +0000 (20:11 -0400)]
Update Module::CoreList for 5.23.3
Peter Martini [Mon, 21 Sep 2015 01:19:58 +0000 (21:19 -0400)]
perldelta updates for 5.23.3
Peter Martini [Mon, 21 Sep 2015 01:46:53 +0000 (21:46 -0400)]
Remove unneeded ", from perldiag entry
Father Chrysostomos [Sun, 20 Sep 2015 22:07:36 +0000 (15:07 -0700)]
op.c: ck_match does not use its context
g++ told me so.
Father Chrysostomos [Sun, 20 Sep 2015 22:06:39 +0000 (15:06 -0700)]
[perl #126064] Apply scalar context to stat args
If we don’t apply scalar context to stat’s argument, then it doesn’t
get its context marked:
$ ./perl -Ilib -MO=Concise -le 'stat stat stat'
7 <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
6 <1> stat vK/1 ->7
5 <1> stat K/1 ->6
4 <1> stat K/1 ->5
- <1> ex-rv2sv sK/1 ->4
3 <#> gvsv[*_] s ->4
-e syntax OK
and it might think that it is in void context at run time:
$ ./perl -Ilib -le 'print 1, 2, 3,(stat stat stat), 4, 5, 6'
1456
It ate my stack items!
If it reads past the beginning of the stack, it can crash.
Just apply scalar context, and Bob’s your uncle, of course.
Craig A. Berry [Sun, 20 Sep 2015 02:54:02 +0000 (21:54 -0500)]
Remove VMS-specific bits of OP_KILL.
The rationale for this change almost twenty years ago was that
the "CRTL's emulation of Unix-style signals and kill()" couldn't
send signals that got noticed by images running in supervisor
mode. This hasn't been true of the CRTL for some time, and we
haven't been using the CRTL's kill for a very long time either.
So remove this once-but-no-longer-necessary hack. Experiments
show that it is still possible to kill any process we want,
assuming the signalling process has the necessary privileges (or
owns the target process).
TODO: implement killpg() -- if Win32 can do it, surely it's
possible on VMS.
Karl Williamson [Thu, 27 Aug 2015 05:16:35 +0000 (23:16 -0600)]
Add API tests for utf8.h macros
Karl Williamson [Mon, 3 Aug 2015 03:20:44 +0000 (21:20 -0600)]
Change meaning of UNI_IS_INVARIANT on EBCDIC platforms
This should make more CPAN and other code work without change. Usually,
unwittingly, code that says UNI_IS_INVARIANT means to use the native
platform code values for code points below 256, so acquiesce to the
expected meaning and make the macro correspond. Since the native values
on ASCII machines are the same as Unicode, this change doesn't affect
code running on them.
A new macro, OFFUNI_IS_INVARIANT, is created for those few places that
really do want a Unicode value. There are just a few places in the Perl
core like that, which this commit changes.
Ricardo Signes [Fri, 18 Sep 2015 17:29:43 +0000 (13:29 -0400)]
Update Encode to CPAN version 2.77
[DELTA]
$Revision: 2.77 $ $Date: 2015/09/15 13:53:27 $
! Unicode/Unicode.xs Unicode/Unicode.pm
Address RT#107043: If no BOM is found, the routine dies.
When you decode from UTF-(16|32) without -BE or LE without BOM,
Encode now assumes BE accordingly to RFC2781 and the Unicode
Standard version 8.0
https://rt.cpan.org/Public/Bug/Display.html?id=107043
! Makefile.PL encoding.t
Mend pull/42
! Encode.xs Makefile.PL encoding.pm encoding.t
Pulled: precompile 1252 table as that is now the Pod::Simple default
https://github.com/dankogai/p5-encode/pull/42
Shlomi Fish [Fri, 4 Sep 2015 19:26:12 +0000 (22:26 +0300)]
Tentative fix for RT#125350 - AFL detected crash.
Craig A. Berry [Wed, 16 Sep 2015 23:52:32 +0000 (18:52 -0500)]
Make MM_VMS::oneline build continuation lines properly.
Tracking upstream commit dd1e236abed699069 because without it the
build is broken.
Daniel Dragan [Sun, 13 Sep 2015 03:30:46 +0000 (23:30 -0400)]
Win32 misc parallel fixes in win32/makefile.mk
-reonly/Extensions_reonly target, which is never used, didn't work in
parallel because it was using left to right execution of an upstream dep
to create build products, that is incompatible with parallel building, fix
by trimming down the list of deps, $(UNIDATAFILES) and Extensions_reonly
know how to build themselves
-regnodes pseudotarget is redundant, it is just an alias for ..\regnodes.h
which isn't a build product, remove regnodes and just use ..\regnodes.h
instead, smaller build graph/less parsing for dmake tool
-I am not questioning relationship between reonly, ..\regnodes.h,
..\regcharclass.h, ..\regcomp.o, $(UNIDATAFILES), Extensions_reonly
since regnodes.h and regcharclass.h are git tracked files and not build
products, and things work well enough as is
-perlglob.exe is needed to build extensions, the natural race conditions
that exist in parallel building meant that it was usually getting built
early enough that it being missing wasn't noticed, and "rebasePE" target
made sure it existed eventually. Some Makefile.PLs indirectly warned that
perlglob was missing, via cmd.exe complaining about it, but that didn't
cause a non-zero exit from the Makefile.PL process. Also since
perlglob.exe is installed into the final installed perl, probably
pointlessly since full perl is not built with PERL_EXTERNAL_GLOB, I think
an installed perl's perlglob.exe was being used when I developed commit
3bdc51af3f and related patches. Since re, DynaLoader, and lib are a very
limited fixed list of modules, and they don't need perlglob.exe, they
don't need to get it as a dep.
-reorder the deps in Extensions_static and Extensions_nonxs so permanent
files, rarely changed files are on the left side, and build products are
on the right. Maybe some kind of optimization happens inside dmake due to
the first couple deps being already built (because they are permanent).
-remove ..\lib\buildcustomize.pl dep, it is redundant. Its other name is
HAVEMINIPERL, and CONFIGPM can't exist without miniperl. Fewer nodes in
dmake's internal graph, since dmake's dep finding algorithm is very
inefficient and repetitive.
-gmake is supported since commit 342634f3c8 but GNUmakefile doesn't
support parallel (-j) building
Nicolas R [Fri, 11 Sep 2015 14:23:39 +0000 (09:23 -0500)]
Remove legacy/dead code from B
B was still using some PERL_VERSION checks
in multiple places even though it's part of core.
This commit removes this dead code and bumps B::VERSION.
For archeology we can still use git if we want to know
what it looks like in an older version.
Karl Williamson [Wed, 16 Sep 2015 21:58:27 +0000 (15:58 -0600)]
regexec.c: Use Perl_croak_nocontext()
Instead of doing a dTHX introduced in
22b433eff9a1ffa2454e18405a56650f07b385b5. I should have pointed out in
that commit message that, instead of doing a full-fledged UTF-8
well-formedness check, it does a quick sanity check sufficient to
prevent looping.
Spotted by Vincent Pitt
Jarkko Hietaniemi [Wed, 16 Sep 2015 21:05:19 +0000 (17:05 -0400)]
Revert "amigaos4: flock unimplemented"
This reverts commit 24631c4f6929bc824e657b74b2edfada4c8d05b0.
The new flock emulation for amigaos has now been tested with parallel
builds and found to fare well.
Karl Williamson [Wed, 16 Sep 2015 20:34:31 +0000 (14:34 -0600)]
PATCH [perl #123562] Regexp-matching "hangs"
The regex engine got into an infinite loop because of the malformation.
It is trying to back-up over a sequence of UTF-8 continuation bytes.
But the character just before the sequence should be a start byte. If
not, there is a malformation. I added a test to croak if that isn't the
case so that it doesn't just infinitely loop. I did this also in the
similar areas of regexec.c.
Comments long ago added to the code suggested that we check for
malformations in the vicinity of the new tests. But that was never
done. These new tests should be good enough to prevent looping, anyway.
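The kind of check described above can be sketched in C as follows. This is an illustration only, assuming an ASCII platform and standard UTF-8 bit patterns; hop_back and the is_* helpers are hypothetical names, not the functions in regexec.c, and the check is a quick sanity test rather than a full well-formedness check.

```c
#include <stddef.h>

/* UTF-8 continuation bytes look like 10xxxxxx. */
static int is_continuation(unsigned char b) { return (b & 0xC0) == 0x80; }

/* UTF-8 start bytes of multi-byte characters look like 11xxxxxx. */
static int is_start(unsigned char b) { return (b & 0xC0) == 0xC0; }

/* Hypothetical helper: step back from `pos` to the first byte of the
 * character that contains it, never going before `start`.
 * Returns NULL on a malformation instead of looping or guessing. */
const unsigned char *hop_back(const unsigned char *start,
                              const unsigned char *pos)
{
    const unsigned char *p = pos;

    /* Back up over the run of continuation bytes. */
    while (p > start && is_continuation(*p))
        p--;

    if (p == pos)   /* nothing skipped: pos must itself begin a character */
        return is_continuation(*p) ? NULL : p;

    /* We skipped continuation bytes, so the byte before them must be a
     * start byte; anything else is malformed. */
    return is_start(*p) ? p : NULL;
}
```

A caller would treat a NULL return the way the commit treats the failed test, by reporting the malformation rather than continuing to scan.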
Karl Williamson [Wed, 16 Sep 2015 14:48:29 +0000 (08:48 -0600)]
regcomp.c: Safer handling of malformed UTF-8
This commit just changes a test to look for UTF-8 invariants instead of
legal UTF-8 start characters. The effective difference is that now all
non-invariants go to the general utf8 handling function, which is
equipped to find malformed UTF-8. Previously, this code would
improperly accept malformations that were illegal start characters or
continuation characters.
Zachary Storer [Wed, 16 Sep 2015 18:02:11 +0000 (12:02 -0600)]
Add 2 books to perlbook
Karl Williamson [Wed, 16 Sep 2015 18:00:52 +0000 (12:00 -0600)]
Add Zachary Storer to AUTHORS
Karl Williamson [Wed, 16 Sep 2015 17:56:44 +0000 (11:56 -0600)]
perlbook: Decrease indent of verbatim lines
so that fewer are likely to exceed 79 columns
Karl Williamson [Wed, 16 Sep 2015 17:45:42 +0000 (11:45 -0600)]
perlbook: Add some L<> links
Jarkko Hietaniemi [Tue, 15 Sep 2015 13:05:11 +0000 (09:05 -0400)]
amigaos4: whitespace only, in amigaos4/
Consistent formatting (and using "Andy Broad" style) for the amigaos4 code:
astyle --style=bsd --indent=tab=4 amigaos4/*.[hc]
(amigaos patch preparation script automates this)
Andy Broad [Sun, 13 Sep 2015 21:40:23 +0000 (17:40 -0400)]
amigaos4: whitespace only
For preprocessor code use 2-indent instead of 4-indent.
Andy Broad [Sun, 13 Sep 2015 18:53:59 +0000 (14:53 -0400)]
amigaos4: use #ifdef/ifndef __amigaos4__ when feasible
Andy Broad [Mon, 14 Sep 2015 14:28:14 +0000 (10:28 -0400)]
amigaos4: declare the amigaos protos in amigaos.h
Andy Broad [Tue, 15 Sep 2015 13:01:12 +0000 (09:01 -0400)]
amigaos4: better kill() implementation
(the underlying UNIX emulation has changed)
Andy Broad [Sun, 13 Sep 2015 23:55:41 +0000 (19:55 -0400)]
amigaos4: implement flock() emulation
Beware: not an exact implementation, the locks follow the OS level
filehandle not the process.
Andy Broad [Sun, 13 Sep 2015 18:37:43 +0000 (14:37 -0400)]
amigaos4: move the amigaos exec code under amigaos4
Largely reimplements 839a9f02, 54fa14d7, e8432c63, 40262ff4.
The upside is that now doio.c and pp_sys.c have much less AmigaOS
specific ifdefs. As a downside, the exec code is now forked (pun
only partially accidental.)
The earlier story regarding fork+exec, that the AmigaOS creating
thread doesn't terminate but instead continues running is both true
and false. The more detailed story is that the user-observable
behaviour is as with POSIX/UNIX. The thread that created the new
"task" (to use the AmigaOS terms) does hang around -- but all it
does is to wait for the new task to terminate, and more importantly,
it holds on to the resources like filehandles. If the task were to
immediately terminate, the resources would be reclaimed by the kernel.
Andy Broad [Sun, 13 Sep 2015 23:49:34 +0000 (19:49 -0400)]
amigaos4: AmigaOS extensions need no ppport.h since in ext/
Andy Broad [Sun, 13 Sep 2015 23:47:35 +0000 (19:47 -0400)]
amigaos4: minimized config.sh
Andy Broad [Mon, 14 Sep 2015 14:37:45 +0000 (10:37 -0400)]
amigaos4: Configure: syslog extension not feasible
Karl Williamson [Wed, 16 Sep 2015 02:06:39 +0000 (20:06 -0600)]
Fix too-long verbatim lines in perlfunc
These were added by 29b04a70d1bf9a10be65363f3f8d6dae44cfa6fc
Karl Williamson [Fri, 27 Mar 2015 03:46:19 +0000 (21:46 -0600)]
if.pm: Better failure message for 'no if'
It previously always said 'use if', even if 'no if' was what was
specified
Karl Williamson [Mon, 9 Mar 2015 18:37:24 +0000 (12:37 -0600)]
PATCH [perl #120790] Unicode::UCD failure to warn on bad input
This ticket was originally because the requester did not realize the
function Unicode::UCD::charscript took a code point argument instead of
a chr one. It was rejected on that basis. But discussion here
suggested it would be better to warn on bad input instead of just
returning <undef>. It turns out that all other routines in Unicode::UCD
but charscript and charblock already do warn. This commit extends that
to the two outlier returns.
Tony Cook [Tue, 15 Sep 2015 00:12:04 +0000 (10:12 +1000)]
prevent op/time.t failures on NetBSD 5.1
- make the watchdog time exceed the maximum time for the "very basic times
test". This doesn't prevent the test from failing, but prevents the
entire test script from being killed by the watchdog if the times()
test does fail
- do more work inside the loop; with the previous "burn cycles" loop,
system time was increasing but user time stayed at its starting value.
James E Keenan [Tue, 8 Sep 2015 12:59:49 +0000 (08:59 -0400)]
Add test for sprintf ordering by both explicit index and not.
Commit 638ca15 earlier in the 5.23 development cycle corrected a
long-standing bug in sprintf. Not surprisingly, code outside the
core built on this bug will now exhibit a different behavior.
CPAN library Text-sprintfn is one such case. One test in its
test suite began to fail; see
https://rt.cpan.org/Ticket/Display.html?id=105989.
This commit adds the test which failed in Text-sprintfn's t/01-basic.t to our
t/op/sprintf.t with the corrected test result. It also adds a 'printf'
version of that corrected expectation to pod/perlfunc.pod.
For: RT #125956
Ricardo Signes [Mon, 14 Sep 2015 15:50:23 +0000 (11:50 -0400)]
release managers for September and December 2015
Chris 'BinGOs' Williams [Mon, 14 Sep 2015 12:19:49 +0000 (13:19 +0100)]
Update ExtUtils-MakeMaker to CPAN version 7.10
[DELTA]
7.10 Thu Sep 10 19:38:55 BST 2015
Bug fixes:
- Fix an issue with quoting of dist_ci target on Win32
7.08 Tue Sep 8 20:24:15 BST 2015
This release reverts all the changes since v7.04 until such time
as the regressions we have found in the "wild" of CPAN can be
tamed
ExtUtils::Command has been included in this release as it was
reincorporated in v7.06
The following bug fixes have also been included:
- RT#100268 fix wrong variable being used
- Check exit status for commands in "make ci" target
- Fix distsignature dependencies for parallel make
- The bundled Encode::Locale has been updated to 1.04
Chris 'BinGOs' Williams [Mon, 14 Sep 2015 12:18:22 +0000 (13:18 +0100)]
Remove ExtUtils-Command, it is merged in EUMM now
Chris 'BinGOs' Williams [Mon, 14 Sep 2015 12:15:17 +0000 (13:15 +0100)]
Module-CoreList-5.20150912 is on the CPAN
Chris 'BinGOs' Williams [Mon, 14 Sep 2015 12:13:37 +0000 (13:13 +0100)]
Update experimental to CPAN version 0.014
[DELTA]
0.014 2015-09-12 00:29:37+02:00 Europe/Amsterdam
Add bitwise to list of known features
David Mitchell [Mon, 14 Sep 2015 13:13:27 +0000 (14:13 +0100)]
Revert "#126039 regexec.c: Fix compiler warning"
This reverts commit 801fcc250783bc56ec8033a5940b3257bcd9a7db.
This commit fixed some compiler warnings in S_regmatch() by adding
a new function-scoped var. I have a better fix - to be applied shortly -
that instead uses tmp boolean vars declared in a small scope as and where
needed.
Dan Collins [Sun, 13 Sep 2015 15:24:14 +0000 (09:24 -0600)]
PATCH [perl #126039] regexec.c: Fix compiler warning
Karl Williamson [Sun, 13 Sep 2015 15:30:38 +0000 (09:30 -0600)]
Add Dan Collins to AUTHORS
Karl Williamson [Sun, 13 Sep 2015 03:02:24 +0000 (21:02 -0600)]
PATCH [perl #126036] toke.c: Silence some compiler warnings
The ticket proposes a new format to output IV's as hex using capital
letters for the digits A-F. However, this isn't necessary in this case,
as even though these are IV's, they can never be negative, and we have
an existing format that prints these fine.
More work needs to be done to fix the problem if something larger than
an IV is used (currently it loops).
Karl Williamson [Sat, 12 Sep 2015 17:39:37 +0000 (11:39 -0600)]
regcomp.c: Simplify some code
Commit 2d3d6e6e7c2d50b1cc47032cf089151823fb20a6 introduced the
'optimizable' variable which if FALSE prevents the [...] node from being
optimized, if otherwise possible, into something simpler. It turns out
that several of the conditions which prevent such optimization can just
clear this flag when they are found, rather than having to test for the
conditions again later when the optimization is actually done.
Karl Williamson [Sat, 12 Sep 2015 17:34:57 +0000 (11:34 -0600)]
regcomp.c: Comment changes only
Karl Williamson [Tue, 25 Aug 2015 03:09:02 +0000 (21:09 -0600)]
PATCH: [perl #125892] qr/(?[ ]) regression with '!'
This regression was introduced in 5.22. It stems from a logic error I
made in a complicated 'if' statement.
Karl Williamson [Sat, 12 Sep 2015 16:10:59 +0000 (10:10 -0600)]
regcomp.c: Add synonym for macro complement
OPERAND and OPERATOR are here complements of each other.
It's better to refer to the thing you are manipulating instead of
{! the thing you aren't}.
Steve Hay [Sat, 12 Sep 2015 20:10:55 +0000 (21:10 +0100)]
Update release schedule
5.20.3 is now done, slightly later than was planned
5.22.1 is coming next, but will probably be October now
Steve Hay [Sat, 12 Sep 2015 19:42:00 +0000 (20:42 +0100)]
Copy perl5203delta into blead
Steve Hay [Sat, 12 Sep 2015 19:21:58 +0000 (20:21 +0100)]
Add epigraph for 5.20.3
Steve Hay [Sat, 12 Sep 2015 17:40:54 +0000 (18:40 +0100)]
5.20.3 today
Steve Hay [Sat, 12 Sep 2015 17:39:49 +0000 (18:39 +0100)]
Module::CoreList: Fill in date for 5.20.3 release
Karl Williamson [Sat, 12 Sep 2015 13:19:03 +0000 (07:19 -0600)]
toke.c: Silence some compiler warnings
These were spotted by Daniel Dragan on Win32.
Lukas Mai [Fri, 11 Sep 2015 19:40:21 +0000 (21:40 +0200)]
replace references to "-w" by strict/warnings
Thomas Sibley [Thu, 10 Sep 2015 05:51:06 +0000 (22:51 -0700)]
English.pm: Only alias $- to $FORMAT_LINES_LEFT
Avoids aliasing %- and @- as %FORMAT_LINES_LEFT and @FORMAT_LINES_LEFT.
I audited the rest of perlvar and English.pm for over-eager aliasing of
unrelated variables but found no other cases.
Thomas Sibley [Thu, 10 Sep 2015 04:35:09 +0000 (21:35 -0700)]
perlvar: %LAST_MATCH_START is not %-
%- has no English alias. %FORMAT_LINES_LEFT works in practice thanks to
indiscriminate typeglob aliasing, but that really doesn't count!
%LAST_PAREN_MATCHES or %LAST_PAREN_ALL_MATCH might be a reasonable
addition for %-, to parallel %+ and %LAST_PAREN_MATCH.
Karl Williamson [Fri, 11 Sep 2015 16:08:50 +0000 (10:08 -0600)]
PATCH: [perl #125990] panic: reg_node overrun
This is a result of a design flaw that I introduced in earlier releases
when attempting to fix earlier design flaws in dealing with the outlier
character ß, LATIN SMALL LETTER SHARP S. The uppercase of this letter
is SS, so that when comparing case-insensitively, it should match 'ss',
and hence, in Unicode terminology, it folds to 'ss'. This character is
the only one representable without using UTF-8 whose fold is longer than
1 byte, and so has to have special treatment. Similarly, the sequence
'ss' can match caselessly the single byte ß, and this is the only such
sequence that can match something shorter than it, unless UTF-8 is
involved. The matter is complicated by the fact that under /di rules,
the ß and 'ss' don't match each other, unless the target string is in
UTF-8. The solution I used earlier (and continue to use) was to create
a special regnode EXACTFU_SS under /ui rules, in which any ß is folded
to 'ss'. But under /di rules, a regular EXACTF regnode is used, and any
ß is retained as-is.
The problem reported here arises when something during the sizing pass
tells perl to use /ui rules rather than the /di rules that were in
effect at the beginning. Recall that perl uses /d rules, for backward
compatibility, unless something overrides them. This can be a 'use'
declaration, an explicit character set pattern modifier, or something in
the pattern. This bug happens only with the final case. There are
several Unicode-defined constructs that can occur in patterns; if one is
found, the perl interpreter infers that Unicode is desired, and switches
from /d to /u for the whole pattern. Two such constructs are a Unicode
property, \p{}, and a Unicode named character, \N{}. The
problem-reproducing code for this ticket uses the latter.
The problem was that the switch from /di to /ui was deferred until AFTER
the sizing pass. (A flag was set when one of these constructs was
encountered to tell the parser to later do the switch.)
During the second pass, the code realizes it is under /ui, so creates an
EXACTFU_SS node and folds the ß into 'ss'. But the first pass thought
it was under /di, so it sized for just the ß, i.e., for 1 byte, so we
exceed the allocated space and do a wild write. This may not cause a
problem if the malloc'd space had rounded-up and there were only a few
of these ß characters.
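The size mismatch between the two passes can be illustrated with a small C sketch. It assumes a non-UTF-8 (Latin-1) pattern on an ASCII platform, where ß is the single byte 0xDF; folded_size is a hypothetical helper and not part of regcomp.c.

```c
#include <stddef.h>

#define SHARP_S 0xDF    /* ß as a single Latin-1 byte (assumption) */

/* Hypothetical sizing helper: bytes needed to hold the folded text of a
 * non-UTF-8 literal. Under /di rules each ß is kept as-is (1 byte);
 * under /ui rules each ß is folded to "ss" (2 bytes). */
size_t folded_size(const unsigned char *s, size_t len, int unicode_rules)
{
    size_t need = len;
    size_t i;

    if (unicode_rules) {
        for (i = 0; i < len; i++)
            if (s[i] == SHARP_S)
                need++;     /* one extra byte per ß */
    }
    return need;
}
```

If the first pass sizes with unicode_rules false and the second pass writes with it true, the write exceeds the reservation by one byte per ß, which is the overrun described above.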
One solution I considered was just keeping a global count of the ß
characters in EXACTF nodes. One could just add these to the space
reserved if /ui rules ended up being used. The problem with this is that
nodes that are near their maximum size without the extra space could
exceed it with, and thus have to be split into 2 nodes, and the extra
node would have an unplanned-for header, taking up more unaccounted-for
space. So that doesn't work. One could also just reserve two bytes for
every ß in an EXACTF node, thus wasting space unless /ui ends up being
used. But the bigger problem is that the code that splits nodes would
have to be made more complicated. It has to find a suitable splitting
spot, by searching through the text of the node, and now it would have
to deal with some of that space not being set.
Instead, I opted to change the code so that when it finds one of these
Unicode-defined constructs, it switches to /u immediately during the
sizing pass. That means that the parse afterwards knows that it is /u
and allocates the correct space. (We now have to remain in /u for the
remainder of the pass, so some code had to change that reverted this.)
This fixes the test case in the ticket. But there remains a problem if
the sizing has happened earlier in the parse before the construct that
changes from /d to /u is encountered. Like:
qr/.....ß....\N{}/di
The incorrect sizing has already happened by the time the \N{} is
encountered. One could solve this by restarting the parse whenever the
/d goes to /u (under /i, as this issue isn't a problem except when
folding ß). That slows things down. Instead, I opted to set a global
flag whenever a ß is found in an EXACTF node. If that flag isn't set at
the time of the /d to /u switch, there's no need to restart the parse.
A 'use utf8' or 'use 5.012' or higher selects /u over /d, so the problem
did not happen with them, nor if the pattern has to be converted to
UTF-8, which restarts the sizing pass, and it only happens with the
sharp s character. And probably, unless there are several ß characters,
the rounding-up of malloc space would cause this to not be an issue.
These explain why this hasn't been reported from the field.
Karl Williamson [Thu, 10 Sep 2015 01:40:09 +0000 (19:40 -0600)]
regcomp.c: Split an internal flag into 2
This splits the flag used to communicate between parsing layers that the
sizing pass needs to be restarted and the pattern upgraded to UTF-8. It
is split into a bit meaning to restart pass1 and a bit to do the
upgrade. This is in preparation for the next commit which will have a
2nd reason to restart pass1.
Karl Williamson [Fri, 11 Sep 2015 16:09:06 +0000 (10:09 -0600)]
regcomp.c: Reorder a test
Prior to this commit, the code tested for some side effects before
testing if the called function even succeeded. This hasn't been a
problem before, because the called function didn't fail when called
from this context. But a future commit will change that.
Karl Williamson [Thu, 10 Sep 2015 01:37:02 +0000 (19:37 -0600)]
regcomp.c: Add assertion and parameter to macro
It's clearer and safer to pass the name of a local variable to a macro,
rather than assuming the macro knows the correct name.
Karl Williamson [Wed, 9 Sep 2015 18:43:15 +0000 (12:43 -0600)]
regcomp.c: Fix, clarify comments
Karl Williamson [Fri, 11 Sep 2015 04:31:39 +0000 (22:31 -0600)]
pods: Discourage use of 'In' prefix for Unicode Block property
This changes perluniprops to not list the equivalent 'In' single form
method of specifying the Block property, and to discourage its use. The
reason is that this is a Perl extension, the use of which is unstable.
A future Unicode release could take over the 'In...' name for a new
purpose, and perl would follow along, breaking the code that assumed the
former meaning. Unicode does not know about this Perl extension, and
they wouldn't care if they did know.
The reason I'm doing this now is that the latest Unicode version
introduced some properties whose names begin with 'In', though no
conflicts arose. But it is clear that such conflicts could arise in the
future. So the documentation only is changed to warn people of this
potential.
perlunicode is updated accordingly.
Tony Cook [Wed, 9 Sep 2015 23:51:43 +0000 (09:51 +1000)]
refine the skip test for the 32-bit x86 ABI brokenness
- the previous test checked ivsize, but that will be 8 for -Duse64bitint
- similarly, byteorder is sized to an iv, so loosen that check to just
make sure it's little-endian
- check we have something like an Intel FPU, this particular check would
give a false negative on MSVC, but these tests are skipped on Win32
anyway
Karen Etheridge [Wed, 9 Sep 2015 00:28:34 +0000 (17:28 -0700)]
fix list rendering in perlhack
At http://perldoc.perl.org/perlhack.html#TESTING, this was rendering as:
1
)
These select Unicode rules....
2
)
If you use the form...
For: RT #126021
Karl Williamson [Wed, 9 Sep 2015 17:52:34 +0000 (11:52 -0600)]
locale.c: Silence porting messages
This changes from using the standard C, generally unsafe, library
functions to using Perl safer alternatives. This code, only used in
debugging, really doesn't need that safety, but I had forgotten that
Perl makes it easy to add it, and it silences the warnings about using
the C functions from t/porting/libperl.t. Why this warning didn't
happen in smoking, I don't know.
Spotted by Dave Mitchell.
darksuji [Wed, 8 Apr 2015 01:44:48 +0000 (18:44 -0700)]
Make behavior of $Carp::MaxArgNums match docs
$Carp::MaxArgNums is supposed to be the number of arguments to display.
For a long time, Carp has instead shown $Carp::MaxArgNums + 1 arguments.
Correct the behavior by making it match the documentation. Also update
tests to make what's being tested more obvious.
Karl Williamson [Tue, 8 Sep 2015 19:18:58 +0000 (13:18 -0600)]
mktables: Fix --annotate option output
Special code suppressed the expanded output of some ranges, where it
would be clear from the range itself what was meant. However, for many
output tables, that range output was changed, so the desired
information is missing. For these tables, don't suppress the expanded
output.
James E Keenan [Tue, 8 Sep 2015 20:29:38 +0000 (16:29 -0400)]
Increment $VERSION in lib/locale.pm.
Karl Williamson [Fri, 4 Sep 2015 17:32:26 +0000 (11:32 -0600)]
Refactor tr/// parsing to work on EBCDIC, fix other bug
This expands the concept introduced for regular expressions in v5.22 of
a portable range, to the transliteration operators. A portable range
has at least one endpoint expressed as \N{} that indicates that the
Unicode definition is desired, or has the endpoints expressed as both
uppercase ASCII alphabetic letters or both lowercase ASCII alphabetics.
The refactor fixes several EBCDIC problems, and it fixes the problem in
all platforms wherein the first endpoint of a range was not checked to
be <= the final endpoint in UTF-8 strings.
There remains a bug in which if any transliterated code point is larger
than IV_MAX, perl loops.
Karl Williamson [Tue, 8 Sep 2015 04:18:55 +0000 (22:18 -0600)]
Slightly shorten most regex patterns
A compiled pattern requires a byte for each non-default modifier, like
/i. Previously, the worst case was presumed in allocating the space
(every modifier being non-default). Now, only the actual needed space
is reserved.
Karl Williamson [Mon, 7 Sep 2015 16:03:27 +0000 (10:03 -0600)]
t/loc_tools.pl: Fix some bugs in locales_enabled()
This code assumed that all locale categories were represented by
non-negative whole numbers. However, it turns out that this assumption
is wrong, as on AIX, LC_ALL is -1. This commit changes our assumption to
take into account that reality; it now assumes that all categories are
larger than a much more negative number, and now the new assumption is
tested for, and if wrong, the code dies instead of silently doing the
wrong thing.
There was also a bug where if a locale category wasn't defined on the
machine, but the corresponding #ifdef for using that category was still
set, the category was improperly assumed to exist.
Karl Williamson [Tue, 8 Sep 2015 15:39:18 +0000 (09:39 -0600)]
lib/locale.t: Use 'chomp' not 'chop'
Karl Williamson [Tue, 8 Sep 2015 15:45:46 +0000 (09:45 -0600)]
lib/locale.t: sub ok() returns pass/fail
This file rolls its own TAP, and it did not have its ok() return
pass/fail.
Karl Williamson [Sun, 6 Sep 2015 16:24:45 +0000 (10:24 -0600)]
lib/locale.pm: Add an assertion
It turns out that the code assumes that the values for LC_CTYPE,
LC_MESSAGES, ... are small non-negative numbers, as a bit position is
reserved for each of these. It's better to make this assumption
explicit rather than getting hard-to-find failures.
(LC_ALL doesn't have to be of this form, and is in fact -1 on AIX)
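The idea of making the assumption explicit can be sketched in C. The actual change is in Perl code in lib/locale.pm; this is only an analogue, and category_mask is a hypothetical name.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: map a locale category number to a one-bit mask.
 * This is only valid when the category is a small non-negative integer,
 * so assert that up front rather than shifting by a bad amount. */
unsigned int category_mask(int category)
{
    assert(category >= 0);
    assert((unsigned int)category < sizeof(unsigned int) * CHAR_BIT);
    return 1u << category;
}
```

A value such as -1 (LC_ALL on AIX, per the note above) would trip the first assertion immediately instead of producing a hard-to-find failure later, which is why LC_ALL has to be handled separately from the per-category bits.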
Karl Williamson [Fri, 8 May 2015 21:19:56 +0000 (15:19 -0600)]
Add more -DL debugging info
This adds more stuff that gets dumped when debugging locale handling.
And it adds even more when the v modifier appears.
Karl Williamson [Tue, 8 Sep 2015 15:53:48 +0000 (09:53 -0600)]
Add code for debugging locale initialization
This initialization is done before the processing of command line
arguments, so that it has to be handled specially. This commit changes
the initialization code to output debugging information if the
environment variable PERL_DEBUG_LOCALE_INIT is set.
I don't see the need to document this outside the source, as anyone who
is using it would be reading the source anyway; it's of highly
specialized use.
Karl Williamson [Tue, 8 Sep 2015 15:52:57 +0000 (09:52 -0600)]
locale.c: Add clarifying comments
Tony Cook [Tue, 8 Sep 2015 01:03:16 +0000 (11:03 +1000)]
report missing authors to stderr
So if authors.t fails the cause is obvious from the test output
*without* having to perform a verbose run.
Karl Williamson [Sun, 6 Sep 2015 17:06:32 +0000 (11:06 -0600)]
embed.fnc: Add comment
Karl Williamson [Mon, 7 Sep 2015 03:44:54 +0000 (21:44 -0600)]
Test-Test.pm: Pedantic pod fixes
Karl Williamson [Thu, 4 Apr 2013 01:06:52 +0000 (19:06 -0600)]
Test::Test.pm: EBCDIC fixes
We are getting Perl working again for EBCDIC in v5.22. The changes here
are necessary to work for these platforms. For modern Perls, there is
one code path for both ASCII and EBCDIC platforms; this wasn't possible
to do for earlier versions.
One perhaps not obvious change is that [^:ascii:] doesn't include \177
which the earlier version does. However \177 was changed in the
substitute in the line above, so this change has no practical effect.
Karl Williamson [Tue, 8 Sep 2015 02:06:40 +0000 (20:06 -0600)]
Rmv trailing ';' on #endif
This creates a compiler warning on AIX
Craig A. Berry [Mon, 7 Sep 2015 18:49:14 +0000 (13:49 -0500)]
Skip aassign.t test under -DDEBUGGING on VMS.
-DDEBUGGING has never gone in $Config{ccflags} on VMS, so the
existing skip check failed to skip. There is, however, the
VMS-only $Config{usedebugging_perl} that is set to Y when DEBUGGING
is enabled. As likely as not, writing this variable to config.sh
was an ancient accident, and it's not mentioned in Porting/Glossary.
However, it's there and seems to be a reliable indicator of what
we've got, so in the absence of anything else reliable, use it.